Random Numbers in Bash (gist.github.com)
85 points by andy99 on March 27, 2023 | 57 comments



The go-to source for randomness in Linux is /dev/urandom, is it not?

I’m not sure the best way to make use of it as a source of random decimals — there are surely a dozen ways to do so — but this works:

  printf '%d\n' "0x$(
    head -c 1000 /dev/urandom |
    shasum |
    head -c 8
  )"
As always, my advice for things like this is that if you’re in need of true random numbers in your shell script then you should probably stop writing whatever it is you are writing as a shell script.


> The go-to source for randomness in Linux is /dev/urandom, is it not?

AFAIK it's the go-to source of randomness in any modern-ish unix.

> I’m not sure the best way to make use of it as a source of random decimals — there are surely a dozen ways to do so — but this works:

`od(1)` will do binary data to decimal:

    > head -c4 /dev/urandom | od -An -vtu4
               4119929390                                                

    > head -c4 /dev/urandom | od -An -vtu4
               3540752584                                                

    > head -c4 /dev/urandom | od -An -vtu4
                998126897
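
If the goal is a number in a range rather than a raw 32-bit value, the od output can be reduced with rejection sampling so the result is not skewed by the modulo. A minimal sketch, assuming a shell with 64-bit arithmetic (bash qualifies); rand_below is a made-up helper name, not something from this thread:

```shell
# rand_below N: print a uniformly distributed integer in [0, N), for 1 <= N <= 2^32.
# Hypothetical helper; reads 4 bytes from /dev/urandom per attempt.
rand_below() {
    local n=$1 r limit
    # Largest multiple of n that fits in 32 bits. Draws at or above it are
    # discarded so that every residue modulo n is equally likely.
    limit=$(( 4294967296 - 4294967296 % n ))
    while :; do
        r=$(od -An -N4 -tu4 /dev/urandom | tr -d ' \n')
        if [ "$r" -lt "$limit" ]; then
            echo $(( r % n ))
            return 0
        fi
    done
}

rand_below 6    # a die roll, 0 through 5
```

A plain `% n` would be close enough for small n, but the loop costs almost nothing: at most half of all draws are ever rejected.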


There are several scenarios where a cryptographic-quality random number generator might be needed within the context of a shell script.

My favorite tool for this is "openssl rand" and I can find the manual page on rhel7 with "man rand."

Here are 8 bytes of randomness in base64:

  $ openssl rand -base64 8 
  mGHxYcVOmA8=
If you like decimal:

  $ printf %d\\n 0x$(openssl rand -hex 2)
  21924
When I've got dozens of users to create, this is how I am assigning their initial passwords.

The shell's $RANDOM is somewhat less trustworthy than other sources here. It also does not appear to be POSIX, but comes from ksh88, where it is described as returning a random integer between 0 and 32767 each time it is referenced. If unset in a script, it cannot be used again for the life of the shell.
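
One concrete reason to treat $RANDOM as less trustworthy: in bash, assigning to RANDOM seeds the generator, and the same seed replays the same sequence (the actual values can differ between bash versions). A small sketch; seeded_draws is a hypothetical helper name:

```shell
# seeded_draws SEED COUNT: print COUNT values of $RANDOM after seeding with SEED.
# Hypothetical helper for illustration; requires bash.
seeded_draws() {
    local count=$2 i
    RANDOM=$1                      # assigning to RANDOM reseeds the generator
    for (( i = 0; i < count; i++ )); do
        printf '%s\n' "$RANDOM"    # each value is in the range 0..32767
    done
}

# Both calls print the same three numbers: the seed fully determines the sequence.
seeded_draws 42 3
seeded_draws 42 3
```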


I use this to create passwords when I need to manually:

    function password_generate { head -c ${1-12} /dev/urandom | base64; }


Sometimes, extreme portability makes us do desperate things.

  # uname -a
  HP-UX prdcrm B.10.20 A 9000/800 862741461 two-user license

  # ls -l /dev/urandom
  /dev/urandom not found

  # openssl rand -base64 8
  v51U1RvSD2Q=
I could run this on VAX VMS if you want.


While being overly pedantic (because "not true" random is a spectrum), it is good practice to use random numbers to ensure that a recurring cron job does not run at the same time on all machines.

This goes from good practice to necessary as a cron job uses increasing resources or talks to a centralized service.

Puppet implements this as splay: https://www.puppet.com/docs/puppet/7/configuration.html#spla...
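
For cron jobs that have to do this in the shell, one option is to derive a stable per-host offset instead of a fresh random number, so each machine always runs at its own time. A rough sketch, assuming a CRC of the host name is good enough for spreading load (it is not random in any cryptographic sense); splay_seconds is a made-up name:

```shell
# splay_seconds HOST MAX: print a delay in [0, MAX) seconds derived from HOST.
# Hypothetical helper; the same host name always maps to the same delay.
splay_seconds() {
    local crc
    # With no file operand, cksum prints "<crc> <byte count>"; keep only the CRC.
    crc=$(printf '%s' "$1" | cksum | cut -d ' ' -f 1)
    echo $(( crc % $2 ))
}

# In a cron job, something like:
#   sleep "$(splay_seconds "$(uname -n)" 300)" && run-the-real-job
splay_seconds "$(uname -n)" 300
```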


Systemd offers multiple options for randomized delays for timers, documented at: https://www.freedesktop.org/software/systemd/man/systemd.tim...


You don't need the shell for that. Every modern cron supports randomization internally.


Set RD to $N random bytes in decimal:

  RD=$(dc -e 16i`xxd -c $N -g $N -l $N -p /dev/urandom | tr a-f A-F`p)


Unfortunately, this is neither readable nor particularly concise.


If I recall correctly, /dev/urandom and /dev/random work the same now.


That change was reverted, and I do not know if they were re-unified since.

https://lwn.net/Articles/889452/

Pragmatically, they are equivalent for most scripting use cases.


> As always, my advice for things like this is that if you’re in need of true random numbers in your shell script then you should probably stop writing whatever it is you are writing as a shell script.

The use of "true" here seems ambiguous.

Does this statement mean that you discourage creation of cryptographic applications with shell scripts, you discourage use of shell scripts to orchestrate applications that use /dev/urandom for any reason, or something else entirely?

(first, and only edit: This comment is in good faith, so if you're downvoting please respond to let me know why)


"True random" as opposed to "pseudo random", which is not cryptographically secure.


There are cryptographically secure pseudorandom generators. In fact, all non-hardware sources (i.e. /dev/random and /dev/urandom) are pseudorandom, as are the keystream generators for most algos that use one.


TIL. Thanks!


Yes, I'm aware of that. The commenter seems to be implying that one should stop using a shell script when one needs "true" random, and I'm curious as to why since one can call out to Perl, Python, etc.

The linked gist doesn't appear to be making the case for using its recommendations in cryptographic applications so what I'm ultimately after is what the commenter thinks is being done in the shell that's inappropriate.


Incorrect. I'm not sure what you mean by "true random", and we could get into some interesting philosophical discussion on that.

However, all software random number generators are pseudo random, including cryptographic ones. They are called CSPRNGs, which stands for Cryptographically Secure Pseudo Random Number Generator.


Are you on GNU/Bash version 5.1[0]? A new bash variable, SRANDOM, was added in the GNU/Bash 5.1 release. It gets its random data from the system’s entropy engine, so it is not linear and cannot be reseeded to get an identical random sequence. For example:

   echo "$SRANDOM"
   for r in {1..5}; do printf "%s\n" "$SRANDOM"; done 
[0] https://www.cyberciti.biz/linux-news/gnu-bash-5-1-released-w...


Or something like this; echo -e "$SRANDOM"{1..5}"\n"


Did you really want to append digits 1, 2, 3, 4, 5 to your high-quality random numbers?
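
The digits show up because brace expansion runs before parameter expansion: the word is first copied once per element of {1..5}, each copy with its own suffix, and only then is $SRANDOM expanded in each copy. A small bash illustration with a fixed string standing in for the random value (print_draws is a hypothetical helper):

```shell
# Brace expansion glues each element onto the rest of the word.
echo "N"{1..3}            # prints: N1 N2 N3

# To print several values without suffixes, loop instead.
# Hypothetical helper; SRANDOM itself needs bash 5.1 or later.
print_draws() {
    local i
    for (( i = 0; i < $1; i++ )); do
        printf '%s\n' "$SRANDOM"
    done
}
print_draws 5
```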


Shameless plug: my bc [1] has a random number generator that is much higher quality than the shell one. Because it's bc, you can generate random numbers of any size, and you can generate random reals as well.

It is also seeded.

[1]: https://git.gavinhoward.com/gavin/bc


Can you give an example usage? I cannot figure out how to invoke your PRNG.


If you want just a number to store in a hardware integer type, use this:

    echo "rand()" | bc
If you want a bounded integer, use this:

    echo "irand(<bound>)" | bc
This bound is adjusted to not bias the random numbers.

It can also be used to generate either really small or really large integers, arbitrary size. My bc will take care of everything behind the scenes.

If you want to generate a real between 0 and 1, use this:

     echo "frand(<places>)" | bc -l
where <places> is the number of decimal places you want. Note the "-l" argument; this function is not built-in like the other two.

If you want an arbitrary real (not between 0 and 1), use this:

    echo "ifrand(<bound>, <places>)" | bc -l
My bc will seed itself from the system CSPRNG, but you can seed the PRNG by assigning the seed to the special variable `seed` first:

    echo "seed = $SEED; rand()" | bc
You can even save the current seed by getting the value of the `seed` variable:

    echo "seed" | bc
If you want to use the same sequence over multiple invocations of bc, you need to save the seed to pass to the next invocation:

    OUT=$(echo "rand(); seed" | bc)
    RAND=$(echo "$OUT" | head -n1)
    SEED=$(echo "$OUT" | tail -n1)
    RAND2=$(echo "seed = $SEED; rand()" | bc)
(Haven't tried the above code. Sorry if it's wrong.)

This pattern works because the `seed` variable is updated after every call to the PRNG.

Does that help?


Thank you, this is perfect!


Fun fact: bash's $RANDOM never yields the same value twice in a row:

  $ for i in {1..999999}; do echo $((RANDOM-RANDOM)); done | sort -nu | grep -E '^-?.$'
  -9
  -8
  -7
  -6
  -5
  -4
  -3
  -2
  -1
  1
  2
  3
  4
  5
  6
  7
  8
  9


Maths isn't my background, but isn't this a serious flaw with the quality of the pseudo-random numbers it generates?

I remember learning about the Enigma machine and how it had a flaw where a letter could not be transformed into itself. And knowing that, for example, an A could never be an A was critical to cracking the Enigma code. Because of that knowledge, my cryptographic intuition is that any seemingly minor fact like this could actually be a critical flaw.


I doubt it matters much. Bash's $RANDOM is implemented using a linear congruential generator, which is not suitable for cryptographic applications anyway.
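
To see why an LCG is a poor fit for anything security-related, here is a toy one in shell arithmetic. The constants are the ones from the sample rand() in the C standard, not bash's actual parameters, and lcg_next, LCG_STATE and LCG_OUT are made-up names. Anyone who learns the state once can compute every later output:

```shell
# Toy linear congruential generator: state' = (a * state + c) mod m.
# Constants are from the sample rand() in the C standard, NOT taken from bash's source.
# Assumes 64-bit shell arithmetic so the multiplication does not overflow.
LCG_STATE=1

lcg_next() {
    LCG_STATE=$(( (LCG_STATE * 1103515245 + 12345) % 4294967296 ))
    # Output the higher bits; the low bits of an LCG repeat with short periods.
    LCG_OUT=$(( (LCG_STATE / 65536) % 32768 ))
}

lcg_next
echo "$LCG_OUT"    # 16838 for the seed 1
```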


I use "openssl rand X" where X is the number of pseudo-random bytes I need.

Say you want to generate a strong password, you can use this handy function that works in bash and zsh:

    function gen_passwd() {
       openssl rand 240 |\
       LC_CTYPE=C tr -dc '[:graph:]' |\
       cut -c 1-${1:-20}
    }


If you have a password manager installed on your system (and you should) then it might have a command-line client that can generate new passwords. e.g.:

    karellen@localhost:~$ keepassxc-cli generate -lU
    hpJUwcYZnzJyAbrnvTWnTzrlqzqcdvsl
    karellen@localhost:~$ keepassxc-cli diceware
    diabetes dinginess goldfish purr cried expenses related
See `keepassxc-cli help generate` and `keepassxc-cli help diceware` for more info, or the man page: https://manpages.debian.org/bullseye/keepassxc/keepassxc-cli...


Or just use what you already have on your machine...

    sort -R /usr/share/dict/words | head -n 4| sed 's/.*/&/;$!s/$// ' |tr '\n' '-' |sed 's/-$/\n/'


Be careful with that:

> The shuf, shred, and sort commands sometimes need random data to do their work. For example, ‘sort -R’ must choose a hash function at random, and it needs random data to make this selection.

> By default these commands use an internal pseudo-random generator initialized by a small amount of entropy, but can be directed to use an external source with the --random-source=file option. An error is reported if file does not contain enough bytes.

-- https://www.gnu.org/software/coreutils/manual/html_node/Rand...

(emphasis mine)

You might end up with a fairly easy-to-brute-force password using that method.

They say "never roll your own crypto", but I think that advice probably holds pretty well for password generators(/managers). I'll stick with using a dedicated program. I reckon it's far less likely to have obvious blunders than something that's just been cobbled together using whatever tools happened to be lying around nearby.
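
If you do want to stay with coreutils, the option quoted above can at least point the shuffle at the kernel's generator. A sketch assuming GNU shuf; passphrase is a hypothetical helper and the word list path in the comment is only an example:

```shell
# passphrase WORDLIST [COUNT]: pick COUNT distinct lines (default 4) from WORDLIST,
# using /dev/urandom as the randomness source, and join them with hyphens.
# Hypothetical helper; requires GNU shuf.
passphrase() {
    shuf --random-source=/dev/urandom -n "${2:-4}" "$1" | paste -s -d '-' -
}

# Example (the path is an assumption; use whatever word list you have):
#   passphrase /usr/share/dict/words 4
```

This only fixes the randomness source. The strength of the result still depends on how many words the list contains and how many you pick.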


Oh handy - didn’t think of using that word list. I download the EFF diceware list.


    sort -R /usr/local/share/eff_short_wordlist_1.txt |awk '{print$2}' | head -n 4| tr '\n' '-' | sed 's/-$/\n/'


Personally I have a tiny shell function that uses openssl rand with a dice ware list. It accepts a length, and it returns CamelCase.

I use it for usernames since Bitwarden doesn’t support diceware names.


pwgen is great as a password generator, and is a tiny package.


Seems odd that these convenience functions don't draw from either /dev/random or /dev/urandom.


I'm pretty sure $RANDOM predates /dev/{u,}random on most OSes.


From the conclusion of the OP:

> I could do more experiments and try to definitively determine what's happening, bottom line, don't use $RANDOM, even for unimportant random numbers.


As hexdump tends to always exist (vs python) I've used the following before:

  hexdump -e '"%d"' -n2 /dev/urandom


I used this to create random decisions for a while:

    if (( RANDOM % 2 == 0 )); then echo Yes; else echo No; fi
I also logged those decisions. I got significantly more No than Yes. I could never reproduce the bias in a loop. But in real life usage, I got more No. Could be by chance of course. But it made me worried enough that I switched to urandom:

    bit=$(od -An -N1 -i /dev/urandom)
    if (( $bit &1 ))
        then echo "Yes"
        else echo "No"
    fi


There's always `head -c20 /dev/random | base64`


likely urandom but the point still stands.


Another fun fact: on Linux systems, the /proc/sys/kernel/random/uuid file contains a random UUID. This can be used to get 32-bit random numbers in POSIX shell without calling out to external utilities. Quite handy on OpenWrt.

    # POSIX read needs a variable name; relying on $REPLY is a bash/ash extension.
    read -r uuid < /proc/sys/kernel/random/uuid
    MYRAND=$(( 0x${uuid%%-*} ))


You can also use "shuf" to create a random listen port for aria2c.

    PORT=$(shuf -i 1000-2000 -n1)
    aria2c --listen-port=$PORT ...other...options...


The problem with random as a word is it covers user intents ranging from "crudely load balance well enough most of the time" or "add a small variable delay to smooth the load over time" up to extremely security critical crypto applications.


I assume that the random numbers in bash come from a uniform distribution. Is there a way of getting random numbers from a different distribution with given parameters?

I can only think of one-liners in R or maybe python.


The standard approach is to take the CDF form of the desired distribution, solve it for equaling some constant, then plug in your uniform value and out comes where on the desired distribution that maps to. The hard part is you either have to pre-solve it or use numerical solving methods and in either case you'd have to write that in Bash.
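
As a concrete case of that recipe: the exponential distribution has CDF F(x) = 1 - exp(-lambda * x), which solves to x = -ln(1 - u) / lambda for a uniform u in [0, 1). A sketch that leaves the floating-point math to awk; exp_from_uniform and exp_sample are made-up names:

```shell
# exp_from_uniform U LAMBDA: map a uniform value U in [0, 1) to an exponential
# sample with rate LAMBDA by inverting the CDF: x = -ln(1 - U) / LAMBDA.
exp_from_uniform() {
    awk -v u="$1" -v lambda="$2" 'BEGIN { printf "%.6f\n", -log(1 - u) / lambda }'
}

# exp_sample LAMBDA: draw U from 4 bytes of /dev/urandom, then transform it.
exp_sample() {
    local r
    r=$(od -An -N4 -tu4 /dev/urandom | tr -d ' \n')
    # r / 2^32 lies in [0, 1), so 1 - u is never zero and log() stays finite.
    exp_from_uniform "$(awk -v r="$r" 'BEGIN { printf "%.10f", r / 4294967296 }')" "$1"
}

exp_from_uniform 0.5 1    # the median of the rate-1 distribution: ln 2 = 0.693147
exp_sample 2
```

Distributions without a closed-form inverse (the normal, for one) need a different trick or a numerical solver, which is where this stops being pleasant in shell.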


for a normal distribution, chatgpt suggests the box-muller transform and calling out to awk (which is a scripting language that continues to surprise me; it seems underrated?)

    normal_random() {
        start=$1
        end=$2
        range=$(echo "$end - $start" | bc -l)

        awk -v start=$start -v range=$range -v seed=$RANDOM '
        BEGIN {
            srand(seed);
            u1 = rand();
            u2 = rand();
            z0 = sqrt(-2 * log(u1)) * cos(2 * 3.14159265358979323846 * u2);
            z1 = sqrt(-2 * log(u1)) * sin(2 * 3.14159265358979323846 * u2);

            random_number = start + (z0 * (range / 6)) + (range / 2);
            printf("%.0f\n", random_number);
        }'
    }
In testing it out, they did seem normally-distributed but rerunning it quickly did seem to produce more consecutively duplicate results than would be expected, not sure why that would be (I was using ranges of 0-10 and 0-20, perhaps the probability of normally-distributed consecutive dupes in that range is quite higher?)

EDIT: Nope, there's definitely something wrong with this, possibly the rand() call(s)? Can anyone guess? (A real-world example of the perils of relying on ChatGPT!)

EDIT2: Figured it out. srand() with no args seeds from the current second, so any run of this function within the same second produces the same result. Fixing this is left as an exercise for the reader.

EDIT3: I fixed it by seeding it externally via $RANDOM.
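
For comparison, a sketch that returns both values of the transform and takes its uniforms as arguments, so the math can be checked separately from the randomness. This is not the code ChatGPT produced; box_muller and normal_pair are hypothetical names, and the first uniform must be greater than 0 because of the log():

```shell
# box_muller U1 U2: print the two standard normal values z0 and z1 that the
# Box-Muller transform produces from uniforms U1 in (0, 1] and U2 in [0, 1).
box_muller() {
    awk -v u1="$1" -v u2="$2" 'BEGIN {
        pi = atan2(0, -1)
        radius = sqrt(-2 * log(u1))
        printf "%.6f\n%.6f\n", radius * cos(2 * pi * u2), radius * sin(2 * pi * u2)
    }'
}

# normal_pair: feed box_muller with uniforms built from /dev/urandom instead of
# awk's srand(), which sidesteps the same-second seeding problem.
normal_pair() {
    # Two unsigned 32-bit integers; word splitting separates them into $1 and $2.
    set -- $(od -An -N8 -tu4 /dev/urandom)
    # Adding 1 to the first draw keeps u1 strictly above zero.
    box_muller "$(awk -v a="$1" 'BEGIN { printf "%.10f", (a + 1) / 4294967296 }')" \
               "$(awk -v b="$2" 'BEGIN { printf "%.10f", b / 4294967296 }')"
}

normal_pair
```

Scaling to a range works as in the function above: multiply by the desired standard deviation and add the mean.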


AWK is amazing. I recommend reading "The AWK Programming Language", read it as a kid! You can even do network programming with gawk: https://www.gnu.org/software/gawk/manual/html_node/TCP_002fI...


z1 is never used.


Good point! chatGPT explained it thusly:

"The Box-Muller transform generates two normally distributed random numbers, z0 and z1, using the two uniformly distributed random numbers u1 and u2. However, only z0 is used in the provided function to create the random number within the given range. If you want to additionally supply z1 in the output, you can." It then provides the modified code.


On a related note, I've once had to reproducibly generate a random sequence of N-bytes, and one of the more elegant ways I discovered and ended up using was to 'encrypt' a known sequence of bytes with the required length (so that the sequence will be long enough) with some known 'password', and truncating it.

`( yes '00' | head -n 100000 | tr -d $'\n') | xxd -r -p | aespipe -e AES256 -P <(echo 'your-seed')`


Since no-one asked, I use something like this in zsh:

  rand() {
    local r4
    IFS= read -rk4 -u0 r4 < /dev/urandom || return
    local b1=$r4[1] b2=$r4[2] b3=$r4[3] b4=$r4[4]
    print $(( #b1 << 24 | #b2 << 16 | #b3 << 8 | #b4 ))
  }
It has the added bonus of only using zsh, no external dependencies, so it is quite fast.


Anyone want to critique this deterministic random number generator I wrote not long ago in Bash that takes advantage of sha256 sums?

https://gist.github.com/pmarreck/e65f457e8755e461a87db7c94d4...


$((32#$DRANDOM_SEED)) is not correct: the sha256 hash is in base 16, not 32. With your current implementation every fifth bit is always 0.
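
The difference is easy to see on a short string. In bash arithmetic, base#digits reads the digits in the given base; a hex digit is at most 15, so in base 32 each digit fills a 5-bit slot whose top bit is always 0. The sketch below also truncates to 15 hex digits (60 bits) so the value fits in bash's signed 64-bit integers; hex_prefix_to_int is a made-up name and is not taken from the gist:

```shell
echo $(( 16#ff ))    # 255: two 4-bit digits, 1111 1111 in binary
echo $(( 32#ff ))    # 495: two 5-bit digits, 01111 01111 in binary

# hex_prefix_to_int HEX: interpret the first 15 hex digits of HEX as an integer.
# Hypothetical helper; requires bash for base#digits and ${var:offset:length}.
hex_prefix_to_int() {
    echo $(( 16#${1:0:15} ))
}

# The argument is the start of the SHA-256 digest of the empty string.
hex_prefix_to_int e3b0c44298fc1c149afbf4c8996fb924
```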


Ooooh, good find!



