Thursday, April 28, 2016

How to use regular expressions in big data



(require 'clojure.string)

(defn ws-replace
  "Replace each run of spaces in `param` with a single underscore."
  [param]
  (clojure.string/replace param #"[ ]+" "_"))

My goal is to provide lists of regular expressions that must match and must not match, so that only the appropriate values are selected from a CSV file. Some more examples:
(ns clojure.examples.hello
 (:gen-class))


(def some-quote 
  (str "It was the best of times. "
  "It was the worst of times. It was Friday "
  "night and it was late."))

(def day-pattern #"\w*day")

(defn java-interop-regex
  "just doing Clojure regex with Java APIs"
  []
  (let [day-found (re-find day-pattern some-quote)]
    (println "Is there a day? ... " day-found)))


(def line " RX packets:1871074138 errors:5 1/12/96 dropped:48 overruns:9")
(def ^:dynamic *inc-matcher* (re-matcher #"\d+" line))
(def ^:dynamic *exc-matcher* (re-matcher #"^((0?[13578]|10|12)(-|\/)(([1-9])|(0[1-9])|([12])([0-9]?)|(3[01]?))(-|\/)((19)([2-9])(\d{1})|(20)([01])(\d{1})|([8901])(\d{1}))|(0?[2469]|11)(-|\/)(([1-9])|(0[1-9])|([12])([0-9]?)|(3[0]?))(-|\/)((19)([2-9])(\d{1})|(20)([01])(\d{1})|([8901])(\d{1})))$" line))
;; Note: the date pattern above is anchored with ^ and $, so it matches only
;; when the whole string is a date. It finds nothing in `line`, where the date
;; 1/12/96 is embedded in other text.
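;; Each call to re-find on a matcher returns the next match.
(-> (re-find *inc-matcher*)
    println)

;; With capture groups, re-find returns a vector: [whole-match group-1 group-2].
(-> (re-find #"(\S+):(\d+)" line)
    println)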
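
Coming back to the goal stated earlier (include and exclude patterns applied to CSV values), here is a minimal sketch of how that selection might look. The names `include-patterns`, `exclude-patterns`, `wanted?` and `select-values` are hypothetical, as are the sample patterns and the sample line. The comma split is deliberately naive and assumes no quoted fields.

```clojure
(require '[clojure.string :as str])

;; Hypothetical pattern lists, chosen only for illustration.
(def include-patterns [#"\d+"])                            ; value must contain a digit
(def exclude-patterns [#"\d{1,2}[-/]\d{1,2}[-/]\d{2,4}"])  ; value must not contain a date such as 1/12/96

(defn wanted?
  "True when `value` matches every include pattern and no exclude pattern."
  [value]
  (and (every? #(re-find % value) include-patterns)
       (not-any? #(re-find % value) exclude-patterns)))

(defn select-values
  "Split one CSV line on commas (naive: no quoted fields) and keep the wanted values."
  [csv-line]
  (->> (str/split csv-line #",")
       (map str/trim)
       (filter wanted?)))

;; Sample line made up for this sketch.
(println (select-values "1871074138, errors, 1/12/96, 48"))
```

Using `re-find` rather than `re-matches` means a pattern only has to occur somewhere inside the value. For whole-value matching, `re-matches` could be swapped in.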


I use this very useful site to test all my regular expressions:
https://regex101.com/

