Let's Write Some Bad Ruby (hashrocket.com)
64 points by adennis4 on Oct 26, 2015 | 35 comments



I (frontend/javascript developer by trade) wholeheartedly recommend learning the standard unix tools - bash, sed, grep, awk, xargs - for these tasks.

While the syntax is daunting at first, and escaping will be an eternal problem, eventually it starts making sense and you'll be able to do anything in seconds. The work in the post becomes trivial:

    mv app/views/foo/**/* app/views/
    sed -i 's/\([a-z]*\)foo_\([a-z]*\)path/\1\2path/' app/views/**/*
or in a more imperative (and probably less efficient) style:

    for file in app/views/foo/**/*; do
        sed -i 's/\([a-z]*\)foo_\([a-z]*\)path/\1\2path/' $file
        new_path=$(sed 's,foo/,,' <<< $file)
        mkdir -p $(dirname $new_path)
        mv $file $new_path
    done


I don't do this anymore. I learnt how to use bash, sed, grep, awk, netcat, socat, and as a user, I work in terminals. I accumulated a lot of scripts, to the point where they were the first tools I would reach for.

But it becomes a self-reinforcing loop: the more you rely on this toolbox, the more problems you create that are best solved with the same tools and the particular mindset they promote. And for work-related tasks, throwaway code rarely stays throwaway: your code becomes part of your infrastructure and you have to make it work reliably, now and later.

After some years, things are not so fun anymore: assembling strings means escaping/quoting characters for different formats and languages. With shell scripts, the path of least resistance is unsafe and introduces many unneeded assumptions.

For example, in the above code, globbing in app/views/foo/... in the "for" expression will not work if you have spaces in file names. And even if you get that right, $file is not inside double quotes, and mv will silently overwrite any previously existing file. And even though some of the assumptions are valid in context, will they always hold?

In order to make them hold, people tend to adapt their problem to fit their tools, not the other way around: file names are always in a well-known ASCII range, with no spaces but a specific separator and zero-padding (e.g. myfile_00020, because files are sorted lexicographically). Unless you are willing to use json/xml tools, data is cut into pieces of strings: records are line-separated, each field is colon-separated, each subfield is comma-separated, etc. (recursive CSV). And one day, that field which always contains dates in YY/DD/MM format (why not ISO-8601?) starts having a time too, formatted HH:MM. After a painful recovery of trashed data, you add "just one more script" to properly escape colons.

Languages are designed with specific goals in mind, and writing complex programs is not what scripting languages are optimized for. I now prefer to use languages with data structures, functions, objects, namespaces: less use for strings and regexes. If I try to write to an existing file, I will be warned (or I can explicitly allow overwriting). Paths are organized as OS-independent trees, etc.: there are better interfaces to the facilities provided by the OS. Coincidentally, the need for scripts gradually shrank, at least for the ones I need to save (I still chain programs on the command line).
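
To make this concrete, here is a rough Ruby sketch of the same move-and-rename job (my own sketch, not the article's script; the paths and the loosened helper regex are assumptions mirroring the shell examples above). Nothing needs quoting, and spaces in file names are harmless:

    require "pathname"

    src = Pathname.new("app/views/foo")
    dst = Pathname.new("app/views")

    Pathname.glob("app/views/foo/**/*").select(&:file?).each do |path|
      # foo_bar_path / edit_foo_bar_path -> bar_path / edit_bar_path
      rewritten = path.read.gsub(/([a-z_]*)foo_([a-z_]*path)/, '\1\2')

      # Recreate the file one level up, keeping the subdirectory layout,
      # then remove the original.
      target = dst + path.relative_path_from(src)
      target.dirname.mkpath
      target.write(rewritten)
      path.delete
    end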


Globbing works fine with spaces, but you need to quote variables, e.g. $file -> "$file", to prevent bash from word-splitting their contents. See the Bash FAQ for details.

The problem with shell is that you need to solve the same problems again and again. This is why I wrote the "bash-modules" project (https://github.com/vlisivka/bash-modules), which lets you put common code into modules and then just do ". import.sh module" from a script or the command line.


Yep, right, I forgot: globs are expanded after IFS word splitting. I retract my comment about globbing not working with spaces, and instead I'll add this one:

    app/views/foo/**/*
If the above does not match any file and you are not lucky enough to use nullglob, your script iterates once in the loop with the value bound to the pattern itself! (I learnt this from http://www.dwheeler.com/essays/filenames-in-shell.html, which is quite informative).
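
For comparison, a glob in Ruby simply returns an empty array when nothing matches, so a loop over it never runs. A tiny sketch with the same assumed path:

    # An unmatched pattern yields [], not the literal pattern itself.
    Dir.glob("app/views/foo/**/*").each do |path|
      puts path
    end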

I honestly commend you for writing a Bash library to solve the problems people can have with bash ("Fight fire with fire"). I looked for other libraries, out of curiosity, and unsurprisingly there are a lot of them. See this list: http://elinux.org/Scriptin.

For example, http://marcomaggi.github.io/docs/mbfl.html has a function to split a pathname into its components, whose definition is here: https://github.com/marcomaggi/mbfl/blob/master/src/modules/f....

I am not criticizing, just stating that it looks painful to write and that I don't want to endure this.


Many CLI commands work differently with and without arguments, e.g. "cat foo" (read from a file) and "cat" (read from stdin), so "nullglob" is a dangerous option. Just check your data instead. It is a bad idea to run sed on /dev/sda, anyway.

My "bash-modules" project is wrapper around libraries. It solves common problem: how to import library (where it is located, /lib, /usr/share, /usr/lib, /etc, /usr/local, /usr/local/share, /usr/local/share, /opt/..., /home/..., etc.). Instead of path hell, you can just type ". import.sh ..." in CLI, script, or library. It also should solve problem with versions in common way (via symlinks from exact version to general name, e.g. args-1.0.2.sh -> args.sh), but I have no free time to add that.


How about a nice game of golf?

    find app/views/foo -depth -type f | while IFS= read -r f; do mkdir -p "$(dirname "${f/foo\//}")"; sed <"$f" >"${f/foo\//}" -e 's/\([a-z]*\)foo_\([a-z]*\)path/\1\2path/g'; echo rm "$f"; done


It is better to learn perl instead of sed/awk. It is much more powerful and cleaner than both of them:

  perl -pi -e 's{(.*?)foo_(.*?)path}{$1$2path}' app/views/*/*


I can't recommend this enough. I was afraid of the shell even after having used it quite a lot, preferring to do changes like these with an IDE, and later with Perl.


As the author I thought I'd comment:

I wrote this article because when I was a very new developer, I was learning a lot more Rails than Ruby. It had never occurred to me to use Ruby for scripting until I paired with a well-rounded Rubyist. With the influx of people learning Rails, I've talked to many people who know much less about Ruby than I did when I was starting out. This isn't mentioned in the article at all, but it was definitely something I thought about when I decided to write it. Honestly, if this helps push one new dev to go out and decide to actually learn some extra Ruby outside of Rails, I'll be happy.

I agree 100% about the shell comments. I know enough shell to do my job, but for me it's quicker to write a Ruby script. Not saying it's the most efficient way, but it is a way, and it's more efficient than doing it by hand. As much as I'd love to be a unix shell wizard, I'd rather be improving my Javascript, or learning more Elixir.

As for the namespace vs scope: the example was just a way to end up with a scenario where we needed to change some text. Plus, for clarity, if I don't want my controllers or URLs namespaced, my routes and files should reflect that, not work around it with a scope unless necessary for some reason (in my opinion). I actually later thought that I should've used the script on a JS project, just to help highlight the fact that this is totally independent of Rails.


I have a scratch folder full of stuff like this written in whatever scripting language is closest at hand, ruby, perl, python, etc. I'd wager most people here would have something similar.

I wonder if it would be an interesting project to comb through it all and see if any useful tools could be extracted. Probably not; like the code in this blog post, most of my one-offs are unashamedly terrible.


I wholeheartedly recommend that anyone comb through their samples and post them on GitHub - https://github.com/voltagex/junkcode is mine. Occasionally I'll graduate projects to their own repo.

This was inspired by http://samba.org/ftp/tridge/talks/junkcode.pdf - you may recognise the author!


I didn't really think it was that terrible or hard to understand. I have minimal experience with Ruby and it made sense to me. Doing this sort of thing was why I learned how to program in the first place.


Couldn't agree more. If you write something for one-time use and it works, how could it really be bad?


I have this dumb C program that scans a folder of .mp3 files and fixes their id3tags from shift-jis encoding to utf-8. Shift-jis isn't officially supported for mp3s, but Japanese Windows uses it as the default encoding for non-UCS2 strings, so when you display a tag that's "latin-1 encoded" on a Japanese locale, it shows up fine, but shows up as gibberish on everything else - https://github.com/moshev/tagfixr
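
The core of the fix is just relabeling and transcoding the bytes. A minimal Ruby sketch of the idea (not the linked C program):

    # The tag bytes are really Shift_JIS, even though they are labeled (and
    # displayed) as if they were Latin-1. Relabel them, then transcode to UTF-8.
    def fix_tag(raw_bytes)
      raw_bytes.dup.force_encoding("Shift_JIS").encode("UTF-8")
    end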


> most of my one-offs are unashamedly terrible.

I never intended this to see the light of day, but since the topic warrants it, here is a python one-liner I'm using to put the indices of combinations of subsets into one list:

    combos = map(list,reduce(lambda x,y: x+y,[[c for c in itertools.combinations(range(8),i)] for i in range(1,9)]))

yuck. These things usually end up as a result of testing and tweaking in an interactive shell.


Whilst not entirely the point of the article, this could have been accomplished without moving any files by changing the route definition in Rails from a namespace to a scope...
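
For anyone who hasn't run into the distinction, roughly (hypothetical routes, not the article's actual code):

    # config/routes.rb

    # namespace prefixes the URL, the route helpers AND the controller module,
    # so controllers/views live under Foo:: and app/views/foo/.
    namespace :foo do
      resources :widgets   # /foo/widgets, foo_widgets_path, Foo::WidgetsController
    end

    # scope with module: keeps the controllers under Foo:: (and the views under
    # app/views/foo/) but drops the /foo URL prefix and the foo_ helper prefix,
    # so nothing on disk has to move.
    scope module: :foo do
      resources :widgets   # /widgets, widgets_path, Foo::WidgetsController
    end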


I had the same thoughts. Why would a company like HashRocket use such an example? Or did they just not know? I don't know.


lol I know right, just enough knowledge to be dangerous...

That may come off as mean/rude towards the author, but I don't mean it that way. I do shit like this all the time because I don't know every little trick of the languages/frameworks I'm using; it's impossible to know it all, unfortunately.


I think the key point is that it is 'impossible to know it all'

I don't think I have ever looked at code I have written more than a few months ago and NOT found a bunch of things I could have done much simpler if I had just known about some feature or trick. I have stopped feeling bad when that happens; now, I figure that the day I stop experiencing that is the day I am no longer getting better as a programmer.


This is the sort of thing a combination of bash/grep/awk/sed is good for also.


True, but if you write Ruby all day and are most familiar with it, a script like this takes a matter of minutes to write compared to trawling the man pages of grep/awk/sed to implement the same thing you're only going to throw away after execution anyway.


After you are familiar with the shell tools, it takes seconds to write something like this, not minutes.


Sure, if you control the environment and can install Ruby. grep/sed/awk are always installed.


Presumably, if you write Ruby all day, there is a good chance that the environments you encounter have Ruby installed as well (and they might even be Windows environments, in which case grep/sed/awk are not "always installed").


Portability becomes a problem though. GNU vs BSD sed/grep :(


They do have a shared subset of features defined by POSIX. It's limiting and takes discipline, but it's not really impractical.


Whatever gets the job done faster. If you treat it like throwaway code, it does not really matter.


I don't know that I could have asked for a better explanation as to why my preferred environment is C#, Visual Studio and Resharper.


I thought the explanation was that you work for a .NET shop?


Sort of. I've been the CTO and first tech employee at my last four gigs, which means that they're .NET shops because I wanted them to be .NET shops. Correlation and causation :-).


Another awesome tool, if you're using emacs, is wgrep mode. You get the results of `grep` in a buffer, can edit it like any other text in emacs, and then saving writes your changes back to all the original files. Really cool.

Emacs macros are wonderful too.


I used to think that mastering regexps was a must-have in a day-to-day programming job, to quickly modify data with differing syntax.

I am still convinced that regexps are useful, but you will not get them right on the first try because of syntax corner cases. The best solution, in my opinion, is to have easy-to-use parsers, including one for your main programming language.

These days I actually use emacs with its macro feature (not elisp), plus Racket with regexps (I didn't mention it earlier, but having a strong REPL for this kind of incremental scripting is also a must-have), and sometimes parsers.


This probably was just the start of writing Ruby. I'd recommend you watch this video of someone writing bad Ruby, in contrast: https://www.livecoding.tv/video/rails-refactoring/


If this lot spends two minutes to understand that code at the end, I'd not depend on them for my code. He's just processed some text, that's all; don't bother reading.


Or use a language with good tooling: https://godoc.org/golang.org/x/tools/cmd/gorename



