Sometimes you want to order an array of strings with embedded numbers. But if we use the sort method from the array class we can have some big surprises (some information has been deleted for clarity):
irb>foo = ['foobar', 'a13bb32', 'a10', 'a13bb2c10',
'a2', 'a3', 'a13bb2c9', 'a13', 'a20', 'a13bb2',
'a13bb1', 'a13bb10', 'a13bb20', 'a13bb3', 'a13bb01']
irb>puts foo.sort
a10
a13
a13bb01
a13bb1
a13bb10
a13bb2
a13bb20
a13bb2c10
a13bb2c9
a13bb3
a13bb32
a2
a20
a3
foobar
The problem is that the sort method from the Array class cannot "see" the embedded numbers. It only sorts comparing each characters one at a time (usually with the ASCII numbers for the priority), so that 'a13' comes before 'a2' (because '1' comes before '2', the sort method doesn't compare '13' with '2'). Because we can open Ruby classes easily, we can add a new way to sort our strings with embedded numbers:
# --------8<--------
class Array
# Method which sort an array composed of strings with embedded numbers by
# the 'natural' representation of numbers inside a string.
def natcmp
reg_number = /\d+/
# We call the sort method of the Array class.
self.sort do |str1, str2|
# We try to find an embedded number
a = str1.match(reg_number)
b = str2.match(reg_number)
# If there is no number
if [a,b].include? nil
str1 <=> str2
else
while true
begin
# We compare strings before the number. If there
# are equal, we will have to compare the numbers
if (comp = a.pre_match <=> b.pre_match) == 0
# If the numbers are equal
comp = (a[0] == b[0]) ? comp = a[0] + a.post_match <=> b[0] + b.post_match :
comp = a[0].to_i <=> b[0].to_i
end
str1, str2 = a.post_match, b.post_match
a = str1.match(reg_number)
b = str2.match(reg_number)
rescue
break
end
end
comp
end
end
end
# Same as 'natcmp' but replace in place.
def natcmp!
self.replace(natcmp)
end
end
# --------8<--------
So now, if we try our new method, here's the result:
irb>puts foo.natcmp
a2
a3
a10
a13
a13bb1
a13bb01
a13bb2
a13bb2c9
a13bb2c10
a13bb3
a13bb10
a13bb20
a13bb32
a20
foobar
We can even use the 'natcmp!' method if we want to change the array in place. I hope this will help somebody.
Tuesday, May 1, 2007
Subscribe to:
Posts (Atom)