The go-to source for randomness in Linux is /dev/urandom, is it not?
I’m not sure the best way to make use of it as a source of random decimals — there are surely a dozen ways to do so — but this works:
printf %d 0x`head -c 1000 /dev/urandom | shasum | head -c 8`
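For what it's worth, here is a shorter route as a rough sketch: od can read bytes from /dev/urandom and print them directly as an unsigned decimal, so the hashing step isn't needed. The helper name rand_u32 is made up for illustration.

```shell
# rand_u32 is a hypothetical helper name, not a standard command.
# Read 4 bytes from /dev/urandom and print them as one unsigned
# 32-bit decimal integer (0 to 4294967295).
rand_u32() {
    od -An -N4 -tu4 /dev/urandom | tr -d ' \n'
    echo
}

rand_u32
```

The tr call only strips the padding that od puts around the number.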
As always, my advice for things like this is that if you’re in need of true random numbers in your shell script then you should probably stop writing whatever it is you are writing as a shell script.
There are several scenarios where a cryptographic-quality random number generator might be needed within the context of a shell script.
My favorite tool for this is "openssl rand", and I can find the manual page on RHEL 7 with "man rand".
Here are 8 bytes of randomness in base64:
$ openssl rand -base64 8
mGHxYcVOmA8=
If you like decimal:
$ printf %d\\n 0x$(openssl rand -hex 2)
21924
When I've got dozens of users to create, this is how I am assigning their initial passwords.
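A minimal sketch of what that might look like, assuming openssl is installed. The function name and the user names are placeholders, not anything from a real system.

```shell
# initial_passwords is a hypothetical helper; alice and bob are
# placeholder user names. Prints one "user:password" line per argument.
# 12 random bytes encode to exactly 16 base64 characters, with no
# '=' padding at the end.
initial_passwords() {
    for user in "$@"; do
        printf '%s:%s\n' "$user" "$(openssl rand -base64 12)"
    done
}

initial_passwords alice bob
```

On Linux, lines of this user:password shape are what chpasswd reads on standard input, though how you deliver the passwords is up to you.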
The shell's $RANDOM is somewhat less trustworthy than other sources here. It also does not appear to be POSIX, but comes from ksh88, where it is described as returning a random integer between 0 and 32767 each time it is referenced. If unset in a script, it cannot be used again for the life of the shell.
Sometimes, extreme portability makes us do desperate things.
# uname -a
HP-UX prdcrm B.10.20 A 9000/800 862741461 two-user license
# ls -l /dev/urandom
/dev/urandom not found
# openssl rand -base64 8
v51U1RvSD2Q=
At the risk of being overly pedantic (because "not true" random is a spectrum): it is good practice to use random numbers to ensure that a recurring cron job does not run at the same time on all machines.
This goes from good practice to necessary as a cron job uses increasing resources or talks to a centralized service.
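One way to sketch that jitter, assuming /dev/urandom is available. The helper name jitter_seconds and the 300-second window are illustrative choices only.

```shell
# jitter_seconds MAX: print a random delay in the range 0..MAX-1.
# Hypothetical helper. Two bytes give 0..65535, so the modulo is
# slightly biased, which does not matter for spreading load.
jitter_seconds() {
    max=$1
    n=$(od -An -N2 -tu2 /dev/urandom | tr -d ' \n')
    echo $(( n % max ))
}

# In the cron job itself one would run: sleep "$(jitter_seconds 300)"
jitter_seconds 300
```

Nothing here needs cryptographic quality; any source that differs between machines would do.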
> As always, my advice for things like this is that if you’re in need of true random numbers in your shell script then you should probably stop writing whatever it is you are writing as a shell script.
The use of "true" here seems ambiguous.
Does this statement mean that you discourage creation of cryptographic applications with shell scripts, you discourage use of shell scripts to orchestrate applications that use /dev/urandom for any reason, or something else entirely?
(first, and only edit: This comment is in good faith, so if you're downvoting please respond to let me know why)
There are cryptographically secure pseudorandom generators. In fact, all non-hardware sources (i.e. /dev/random and /dev/urandom) are pseudorandom, as are the keystream generators for most algorithms that use one.
Yes, I'm aware of that. The commenter seems to be implying that one should stop using a shell script when one needs "true" random, and I'm curious as to why since one can call out to Perl, Python, etc.
The linked gist doesn't appear to be making the case for using its recommendations in cryptographic applications so what I'm ultimately after is what the commenter thinks is being done in the shell that's inappropriate.
Incorrect. I'm not sure what you mean by "true random", and we could get into some interesting philosophical discussion on that.
However, all software random number generators are pseudo random, including cryptographic ones. They are called CSPRNGs, which stands for Cryptographically Secure Pseudo Random Number Generator.
Are you on GNU/Bash version 5.1 [0]? The Bash 5.1 release added a new variable, SRANDOM, which gets its random data from the system’s entropy engine, is not linear, and cannot be reseeded to get an identical random sequence. For example:
echo "$SRANDOM"
for r in {1..5}; do printf "%s\n" "$SRANDOM"; done
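If the script may also run under older shells, a fallback could look roughly like this. The wrapper name secure_u32 is hypothetical, and the check simply assumes nobody has set or unset SRANDOM by hand.

```shell
# secure_u32 is a hypothetical wrapper: use SRANDOM when the shell
# provides it (bash 5.1 and later), otherwise read 4 bytes from
# /dev/urandom and print them as an unsigned decimal.
secure_u32() {
    if [ -n "${SRANDOM:-}" ]; then
        echo "$SRANDOM"
    else
        od -An -N4 -tu4 /dev/urandom | tr -d ' \n'
        echo
    fi
}

secure_u32
```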
Shameless plug: my bc [1] has a random number generator that is much higher quality than the shell one. Because it's bc, you can generate random numbers of any size, and you can generate random reals as well.
Maths isn't my background, but isn't this a serious flaw with the quality of the pseudo-random numbers it generates?
I remember learning about the Enigma machine and how it had a flaw where a letter could not be transformed into itself. And knowing that, for example, an A could never be an A was critical to cracking the Enigma code. Because of that knowledge, my cryptographic intuition is that any seemingly minor fact like this could actually be a critical flaw.
I doubt it matters much. Bash's $RANDOM is implemented using a linear congruential generator, which is not suitable for cryptographic applications anyway.
If you have a password manager installed on your system (and you should) then it might have a command-line client that can generate new passwords. e.g.:
> The shuf, shred, and sort commands sometimes need random data to do their work. For example, ‘sort -R’ must choose a hash function at random, and it needs random data to make this selection.
> By default these commands use an internal pseudo-random generator initialized by a small amount of entropy, but can be directed to use an external source with the --random-source=file option. An error is reported if file does not contain enough bytes.
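As a small sketch of that option with GNU coreutils, here shuf is pointed at /dev/urandom. The wrapper name pick_int is made up for the example.

```shell
# pick_int LO HI: print one integer from LO..HI (inclusive).
# Hypothetical wrapper around GNU shuf, which draws its randomness
# from /dev/urandom here instead of its internal generator.
pick_int() {
    shuf -i "$1-$2" -n 1 --random-source=/dev/urandom
}

pick_int 1 100
```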
You might end up with a fairly easy-to-brute-force password using that method.
They say "never roll your own crypto", but I think that advice probably holds pretty well for password generators(/managers). I'll stick with using a dedicated program. I reckon it's far less likely to have obvious blunders than something that's just been cobbled together using whatever tools happened to be lying around nearby.
I used this to create random decisions for a while:
if (( RANDOM % 2 == 0 )); then echo Yes; else echo No; fi
I also logged those decisions. I got significantly more No than Yes. I could never reproduce the bias in a loop. But in real-life usage, I got more No. Could be by chance, of course. But it made me worried enough that I switched to urandom:
bit=$(od -An -N1 -i /dev/urandom)
if (( $bit &1 ))
then echo "Yes"
else echo "No"
fi
Another fun fact: on Linux systems, the /proc/sys/kernel/random/uuid file contains a random UUID. This can be used to get 32-bit random numbers in POSIX shell without calling out to external utilities. Quite handy on OpenWrt.
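Roughly like this, as a sketch (Linux only; the function name uuid_u32 is invented here). The first group of the UUID is 8 hex digits, which shell arithmetic can convert on its own.

```shell
# uuid_u32 is a hypothetical helper name. Uses only shell builtins.
uuid_u32() {
    # Each read of this file yields a fresh random UUID.
    read -r uuid < /proc/sys/kernel/random/uuid
    # Keep the first 8 hex digits (everything before the first '-')
    # and let arithmetic expansion convert them to decimal.
    hex=${uuid%%-*}
    echo $(( 0x$hex ))
}

uuid_u32
```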
The problem with random as a word is it covers user intents ranging from "crudely load balance well enough most of the time" or "add a small variable delay to smooth the load over time" up to extremely security critical crypto applications.
I assume that the random numbers in bash come from a uniform distribution. Is there a way of getting random numbers from a different distribution with given parameters?
I can only think of one-liners in R or maybe python.
The standard approach is to take the CDF form of the desired distribution, solve it for equaling some constant, then plug in your uniform value and out comes where on the desired distribution that maps to. The hard part is you either have to pre-solve it or use numerical solving methods and in either case you'd have to write that in Bash.
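To make that concrete with a case that pre-solves easily, here is a sketch for the exponential distribution. Its CDF is F(x) = 1 - exp(-rate * x), and solving F(x) = u gives x = -log(1 - u) / rate. The helper name rand_exp is hypothetical, and the floating-point part is handed to awk rather than written in pure Bash.

```shell
# rand_exp RATE: print one sample from an exponential distribution.
# Hypothetical helper illustrating inverse-CDF sampling.
rand_exp() {
    # Uniform integer in 0..4294967295 from /dev/urandom.
    n=$(od -An -N4 -tu4 /dev/urandom | tr -d ' \n')
    # Scale to u in [0, 1), then apply the inverse CDF in awk.
    awk -v n="$n" -v rate="$1" 'BEGIN {
        u = n / 4294967296
        print -log(1 - u) / rate
    }'
}

rand_exp 0.5
```

Distributions without a closed-form inverse would need a numerical solver in place of the one-line formula.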
For a normal distribution, ChatGPT suggests the Box-Muller transform and calling out to awk (which is a scripting language that continues to surprise me; it seems underrated?)
In testing it out, the results did seem normally distributed, but rerunning it quickly did seem to produce more consecutive duplicate results than would be expected. Not sure why that would be (I was using ranges of 0-10 and 0-20; perhaps the probability of normally distributed consecutive dupes in that range is quite a bit higher?)
EDIT: Nope, there's definitely something wrong with this, possibly the rand() call(s)? Can anyone guess? (A real-world example of the perils of relying on ChatGPT!) EDIT2: Figured it out. srand with no args seeds to the current second, making any run of this function within the same second produce the same result. Fixing this is left as an exercise to the reader. EDIT3: I fixed it by seeding it externally via $RANDOM.
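For reference, a sketch of the fixed idea. This is not the code ChatGPT produced; the helper name rand_normal is made up, and the seed is taken from /dev/urandom here instead of $RANDOM so that it also works in shells without $RANDOM.

```shell
# rand_normal MEAN SD: print one normally distributed sample.
# Hypothetical helper. The seed is passed in from outside awk because
# srand() with no argument uses the time of day, which repeats within
# the same second.
rand_normal() {
    seed=$(od -An -N4 -tu4 /dev/urandom | tr -d ' \n')
    awk -v seed="$seed" -v mean="$1" -v sd="$2" 'BEGIN {
        srand(seed)
        pi = atan2(0, -1)
        u1 = 1 - rand()          # in (0, 1], so log(u1) is defined
        u2 = rand()
        # Box-Muller: z0 is one standard normal sample.
        z0 = sqrt(-2 * log(u1)) * cos(2 * pi * u2)
        print mean + sd * z0
    }'
}

rand_normal 10 2
```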
"The Box-Muller transform generates two normally distributed random numbers, z0 and z1, using the two uniformly distributed random numbers u1 and u2. However, only z0 is used in the provided function to create the random number within the given range. If you want to additionally supply z1 in the output, you can." It then provides the modified code.
On a related note, I've once had to reproducibly generate a random sequence of N-bytes, and one of the more elegant ways I discovered and ended up using was to 'encrypt' a known sequence of bytes with the required length (so that the sequence will be long enough) with some known 'password', and truncating it.
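A sketch of that trick with the openssl command line, under a few assumptions: the cipher choice (AES-256-CTR), the all-zero input, and the helper name det_bytes are mine, and -pbkdf2 needs OpenSSL 1.1.1 or newer.

```shell
# det_bytes N PASSWORD: print N reproducible pseudo-random bytes as hex.
# Hypothetical helper. It encrypts N zero bytes; -nosalt keeps the key
# derivation deterministic, so the same password always yields the
# same byte stream.
det_bytes() {
    head -c "$1" /dev/zero |
        openssl enc -aes-256-ctr -nosalt -pbkdf2 -pass "pass:$2" |
        head -c "$1" |
        od -An -v -tx1 | tr -d ' \n'
    echo
}

det_bytes 16 'example seed'
```

Running it twice with the same arguments should print the same hex string, which is the whole point; a different password gives a different stream.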