Strings
Common methods – see all at Ruby Doc: String
- .length (or .size) => returns the number of characters; .count('chars') => counts occurrences of the given characters
- .include?('sub') => returns true if sub is found, false otherwise
- .scan(/regexp/) => returns an array of all matches (an array of arrays when the regexp has capture groups)
- .split(/pattern/) => returns an array of substrings (whitespace-only pieces between matches are kept)
- .sub(/regexp/, replacement) => replaces the first occurrence, returns a new string
- .gsub(/regexp/, replacement) => replaces all occurrences, returns a new string
- .match(/pattern/, pos) => searches from pos, returns MatchData (or nil); .captures on the result returns the captured groups
- .chars => converts into an array of characters
- .slice(arg) => arg can be an integer, range, regexp, or string; returns the substring or nil
- .slice!(arg) => deletes the matching part from the string and returns the deleted part
- .capitalize, .downcase, .upcase, .swapcase => manipulate case
- .center(n), .ljust(n), .rjust(n) => pad with whitespace up to width n
- .replace(other) => replaces the contents of the string with another, already computed string
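A quick sketch of several of the methods above in action. The sample strings are made up for illustration; the comments show what each call should return.

```ruby
s = "hello world"

s.length              # => 11, number of characters
s.count("l")          # => 3, occurrences of the character "l"
s.include?("wor")     # => true
s.chars.first(3)      # => ["h", "e", "l"]
s.capitalize          # => "Hello world"
s.upcase              # => "HELLO WORLD"

"hi".center(6)        # => "  hi  "
"hi".ljust(4, ".")    # => "hi.."
"hi".rjust(4, ".")    # => "..hi"

# slice! modifies the receiver, so work on a copy
t = s.dup
removed = t.slice!(0, 6)
# removed => "hello ", t => "world"
```

Note that .count does not give the string length: it counts how many characters of the receiver appear in the argument's character set.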
Examples
Slice a string to isolate the part of interest:
a = "hello hello my dear"
a.slice(-7, 7) => "my dear"
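Since .slice accepts several argument types, here is a small sketch of each form on the same string. The particular indices and patterns are only illustrative.

```ruby
a = "hello hello my dear"

a.slice(0)          # => "h"      (integer index)
a.slice(6..10)      # => "hello"  (range of indices)
a.slice(-4..-1)     # => "dear"   (negative indices count from the end)
a.slice(/m\w/)      # => "my"     (first regexp match)
a.slice("dear")     # => "dear"   (substring, if present)
a.slice("cat")      # => nil      (substring not found)
```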
Split a string using regular expression patterns:
We are given a string with multiple records of interest. Let’s split it into its records using a pattern.
s = "Mon 03:00-9:00 Tue 12:00-24:00 Mon 15:00-18:00"
regxp = /([A-Z]{1}[a-z]{2}\s\d{2}[:]\d*[-]\d*[:]\d*)/
s_split = s.split(regxp)
=> ["", "Mon 03:00-9:00", " ", "Tue 12:00-24:00", " ", "Mon 15:00-18:00"]
s_split.delete(" ")
s_split.delete("")
s_split
=> ["Mon 03:00-9:00", "Tue 12:00-24:00", "Mon 15:00-18:00"]
Regular expression pattern:
- uppercase letters (1) + lowercase letters (2) => [A-Z]{1}[a-z]{2}
- space (1)=> \s
- two digits => \d{2}
- special char => [:]
- zero or more digits => \d*
- special char => [-]
- the end time repeats the same pieces => \d*[:]\d*
Tip: Test this pattern at Rubular!
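As an alternative to split followed by delete, the records can also be collected with .scan. This is a different technique from the one above, sketched here with the same pattern minus the capture group so that scan returns a flat array; the variable names are only illustrative.

```ruby
s = "Mon 03:00-9:00 Tue 12:00-24:00 Mon 15:00-18:00"

# No capture group: scan returns the matched strings directly
record = /[A-Z][a-z]{2}\s\d{2}:\d*-\d*:\d*/
records = s.scan(record)
# => ["Mon 03:00-9:00", "Tue 12:00-24:00", "Mon 15:00-18:00"]

# With capture groups, scan returns one sub-array per record
parts = s.scan(/([A-Z][a-z]{2})\s(\d{2}:\d*)-(\d*:\d*)/)
# => [["Mon", "03:00", "9:00"], ["Tue", "12:00", "24:00"], ["Mon", "15:00", "18:00"]]
```

With scan there are no empty or whitespace-only entries to clean up afterwards.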
Cleaning Whitespaces
s = "There is an extra  space here. A change of line as well. \nThird line here."
s.sub(/\s{2}/, ' ') # replace the first run of 2 whitespace characters with one space
s.gsub(/\n/, '') # remove every newline
What counts as whitespace? See the discussion in the Ruby Cookbook.
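A few built-in helpers cover the most common cleanup cases. The sketch below uses a made-up string with leading, trailing, and doubled spaces plus a newline; note that \s matches spaces, tabs, and newlines alike.

```ruby
s = "  There is an extra  space here. \nThird line here.  "

trimmed   = s.strip          # drops leading and trailing whitespace only
squeezed  = s.squeeze(" ")   # collapses runs of spaces, leaves the newline alone
collapsed = s.gsub(/\s+/, " ").strip
# collapsed => "There is an extra space here. Third line here."
```

The last form turns every run of whitespace, newline included, into a single space, which is usually what "cleaning" means for one-line output.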
Extracting a substring between tags < and > (see: Stack Overflow)
s = "<ants> <pants>"
simplest regexp = <(\S+)>
< ( ) > => capture the chars between the tags
\S => any non-whitespace character
+ => one or more
another regexp that works too = <([^>]*)>
[^>] => any character except the closing bracket >
* => zero or more
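The two patterns behave the same on the string above, but they can differ when the tags are not separated by whitespace. A small sketch, where the second input string is a hypothetical variation added for comparison:

```ruby
spaced = "<ants> <pants>"
packed = "<ants><pants>"   # hypothetical input: no whitespace between the tags

greedy  = /<(\S+)>/        # \S+ is greedy and ">" is itself non-whitespace
bounded = /<([^>]*)>/      # [^>]* can never run past a closing bracket

spaced.scan(greedy)    # => [["ants"], ["pants"]]
spaced.scan(bounded)   # => [["ants"], ["pants"]]
packed.scan(greedy)    # => [["ants><pants"]]
packed.scan(bounded)   # => [["ants"], ["pants"]]
```

When the input format is not guaranteed to have whitespace between tags, the [^>] form is the safer choice.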
.scan will return an array of arrays of all matches:
subs = s.scan(/<(\S+)>/)
=> [["ants"], ["pants"]]
.match will return the first match as a MatchData:
matchdata = s.match(/<(\S+)>/)
=> #<MatchData "<ants>" 1:"ants">
.match will return the 2 occurrences with this regexp:
match = s.match(/<(\S+)> <(\S+)>/)
=> #<MatchData "<ants> <pants>" 1:"ants" 2:"pants">
.slice will return the first match including its surrounding tags:
s.slice(/<(\S+)>/)
=> "<ants>"
.split will return an array of the captured strings plus the pieces between matches, including whitespace:
s.split(/<(\S+)>/)
=> ["", "ants", " ", "pants"]
To retrieve "ants":
subs[0][0]
subs.first.first
matchdata[1]
matchdata.captures[0]
To retrieve "pants":
subs[1][0]
subs.last.first
match = s.match(/<(\S+)> <(\S+)>/)
match.captures => ["ants", "pants"]
match[2] => "pants"
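Numbered groups get harder to follow as a pattern grows, so named captures can help. The sketch below is one possible variation; the group names first and second are arbitrary labels chosen for this example.

```ruby
s = "<ants> <pants>"

m = s.match(/<(?<first>\S+)> <(?<second>\S+)>/)
m[:first]      # => "ants"
m[:second]     # => "pants"
m.captures     # => ["ants", "pants"]

# match returns nil when nothing matches, so guard before indexing
none = s.match(/<(\d+)>/)
digits = none && none[1]     # => nil

# String#[] with a regexp and a group number skips the MatchData step
first_tag = s[/<(\S+)>/, 1]  # => "ants"
```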
More about MatchData at GeeksforGeeks.