Encryption and Decryption Basics: Converting Between Byte Arrays and (Hexadecimal) Strings

2024-06-17 18:25 | Source: compiled from the web

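Encryption algorithms and hash algorithms generally operate on byte arrays: they apply a series of transformations and computations to a byte array, and the result is also a byte array. What we usually want to encrypt, however, is a string, so a String has to be converted to byte[], which is simple. Decryption likewise involves converting byte[] back to String. In addition, after a user's password has been hashed, the result has to be stored in a database, so the resulting byte[] must also be converted to a String.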

1. Converting String to byte[] is simple, because the String class provides methods for it directly:

    public byte[] getBytes(Charset charset) {
        if (charset == null) throw new NullPointerException();
        return StringCoding.encode(charset, value, 0, value.length);
    }

    /**
     * Encodes this {@code String} into a sequence of bytes using the
     * platform's default charset, storing the result into a new byte array.
     *
     * @return The resultant byte array
     *
     * @since JDK1.1
     */
    public byte[] getBytes() {
        return StringCoding.encode(value, 0, value.length);
    }
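As a minimal sketch of this conversion, assuming UTF-8 is the encoding you want (the class and method names below are made up for illustration), the charset can be passed explicitly so that the result does not depend on the platform default used by the no-argument getBytes():

```java
import java.nio.charset.StandardCharsets;

class StringToBytesSketch {
    // Encode text with an explicit charset instead of the platform default.
    static byte[] toUtf8(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Decode bytes that were produced by toUtf8.
    static String fromUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "Hello " followed by two CJK characters, written as Unicode escapes
        String text = "Hello \u4e16\u754c";
        byte[] bytes = toUtf8(text);
        // 6 ASCII bytes plus 3 bytes for each of the two CJK characters
        System.out.println(bytes.length);
        System.out.println(fromUtf8(bytes).equals(text));
    }
}
```

Because these bytes came from a String in the first place, decoding them with the same charset gives the original text back.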

2. Converting byte[] to String, however, is not that simple

The reason is that we cannot simply use the String constructors:

    /**
     * Constructs a new {@code String} by decoding the specified array of bytes
     * using the platform's default charset. The length of the new {@code
     * String} is a function of the charset, and hence may not be equal to the
     * length of the byte array.
     *
     * <p> The behavior of this constructor when the given bytes are not valid
     * in the default charset is unspecified. The {@link
     * java.nio.charset.CharsetDecoder} class should be used when more control
     * over the decoding process is required.
     */
    public String(byte bytes[]) {
        this(bytes, 0, bytes.length);
    }

    /**
     * Constructs a new {@code String} by decoding the specified array of
     * bytes using the specified {@linkplain java.nio.charset.Charset charset}.
     * The length of the new {@code String} is a function of the charset, and
     * hence may not be equal to the length of the byte array.
     *
     * <p> This method always replaces malformed-input and unmappable-character
     * sequences with this charset's default replacement string. The {@link
     * java.nio.charset.CharsetDecoder} class should be used when more control
     * over the decoding process is required.
     */
    public String(byte bytes[], Charset charset) {
        this(bytes, 0, bytes.length, charset);
    }

In other words, we can use neither new String(bytes) nor new String(bytes, charset).

Why?

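The reason is simple. Algorithms such as MD5, SHA-256 and SHA-512 apply all kinds of transformations and computations to a byte[] and produce a new byte[]. Because those computations treat the bytes as arbitrary binary data, the output is in general not a valid byte sequence in a character encoding such as UTF-8 or GBK. Decoding it with a charset yields garbled text, and since malformed sequences are replaced during decoding, the original bytes cannot be recovered from the resulting String.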
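The loss can be seen in a small sketch. The three bytes below are an assumed stand-in for hash output (0xFF and 0xFE never occur in valid UTF-8), and the class and method names are made up for illustration:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class LossyDecodeSketch {
    // Decode arbitrary bytes as UTF-8 text, then encode the text back to bytes.
    static byte[] throughString(byte[] raw) {
        String decoded = new String(raw, StandardCharsets.UTF_8);
        return decoded.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Two bytes that are invalid in UTF-8, followed by the letter 'A'
        byte[] raw = {(byte) 0xFF, (byte) 0xFE, 0x41};
        byte[] back = throughString(raw);
        // Prints false: the invalid bytes were replaced during decoding
        System.out.println(Arrays.equals(raw, back));
    }
}
```

Once the bytes have passed through new String(...), the original values are gone, which is why a lossless text encoding such as hexadecimal is needed.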

So how should we convert the byte[] produced by encryption to a String?

First, let us ask why that byte[] needs to be converted to a String at all.

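There are two reasons. First, the result has to be stored, for example in a database. Second, when a password is hashed with a one-way, irreversible hash algorithm, we need to check whether the password a user logs in with is correct, which means comparing two hashed byte[] values to see whether they are identical. Two byte[] values can be compared one byte at a time or several bytes at a time, or they can be converted to Strings and the two Strings compared. Since the result has to be stored anyway, in practice it is usually converted to a String and compared in that form.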
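For the byte-by-byte option, the JDK already provides comparison helpers. The sketch below is only an illustration (the class and method names are made up): Arrays.equals is a plain comparison, while MessageDigest.isEqual is the JDK method intended for comparing digests.

```java
import java.security.MessageDigest;
import java.util.Arrays;

class DigestCompareSketch {
    // Plain element-by-element comparison of two byte arrays.
    static boolean sameBytes(byte[] a, byte[] b) {
        return Arrays.equals(a, b);
    }

    // Comparison using the JDK helper meant for digest values.
    static boolean sameDigest(byte[] a, byte[] b) {
        return MessageDigest.isEqual(a, b);
    }

    public static void main(String[] args) {
        byte[] stored = {0x37, (byte) 0xA9, 0x71};
        byte[] computed = {0x37, (byte) 0xA9, 0x71};
        byte[] other = {0x37, (byte) 0xA9, 0x72};
        System.out.println(sameBytes(stored, computed));
        System.out.println(sameDigest(stored, computed));
        System.out.println(sameDigest(stored, other));
    }
}
```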

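For encryption and decryption, the usual way to convert byte[] to String is to represent the binary data as a char[] of hexadecimal digits. Each byte is 8 bits and every 4 bits correspond to one hexadecimal character, so one byte corresponds to two hexadecimal characters: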

    public class HexUtil {

        private static final char[] DIGITS = {
            '0', '1', '2', '3', '4', '5', '6', '7',
            '8', '9', 'a', 'b', 'c', 'd', 'e', 'f'
        };

        public static String encodeToString(byte[] bytes) {
            char[] encodedChars = encode(bytes);
            return new String(encodedChars);
        }

        public static char[] encode(byte[] data) {
            int l = data.length;
            char[] out = new char[l << 1];
            // two characters form the hex value.
            for (int i = 0, j = 0; i < l; i++) {
                out[j++] = DIGITS[(0xF0 & data[i]) >>> 4]; // high 4 bits
                out[j++] = DIGITS[0x0F & data[i]];         // low 4 bits
            }
            return out;
        }
    }

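Hexadecimal notation uses the 16 digits and letters 0-9 and a-f to represent the 16 values 0 to 15. When converting to a String, we can therefore use the character '0' for the value 0, '1' for 1, and so on up to 'f' for 15.

That is why the code above uses the 16 characters "0-9abcdef" for the values 0 to 15. The conversion itself is done by the function public static char[] encode(byte[] data):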

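int l = data.length; char[] out = new char[l << 1]; allocates an output array twice as long as the input, because each byte becomes two characters. DIGITS[(0xF0 & data[i]) >>> 4] first computes 0xF0 & data[i], which clears the low 4 bits, and then shifts right by 4 to obtain the high 4 bits of the i-th byte in the array; the DIGITS[] array then gives the character for those high 4 bits. (Clearing the low 4 bits is redundant in itself, since the shift discards them anyway, but the mask is still required: data[i] is sign-extended to an int, and without the mask a negative byte would produce an index outside DIGITS.)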

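DIGITS[0x0F & data[i]] first computes 0x0F & data[i], which clears the high 4 bits and leaves the value of the low 4 bits; the DIGITS[] array then gives the character for those low 4 bits.

In this way the byte[] array is converted into a char[] of hexadecimal characters. Finally, new String(encodedChars) produces the result as a String.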

The final String therefore consists only of the 16 characters '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'a', 'b', 'c', 'd', 'e', 'f' and contains no other letters, such as g, h, j, k, l, m or n.
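To make the bit manipulation concrete, here is a small self-contained sketch of the same nibble arithmetic. The class and method names are made up for illustration, and it is not the HexUtil class itself:

```java
class NibbleSketch {
    private static final char[] DIGITS = "0123456789abcdef".toCharArray();

    // Character for the high 4 bits; the mask also removes sign-extension bits.
    static char highDigit(byte b) {
        return DIGITS[(0xF0 & b) >>> 4];
    }

    // Character for the low 4 bits.
    static char lowDigit(byte b) {
        return DIGITS[0x0F & b];
    }

    // Two hex characters per input byte.
    static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder(data.length * 2);
        for (byte b : data) {
            sb.append(highDigit(b)).append(lowDigit(b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte b = (byte) 0xAB; // negative as a Java byte (-85)
        System.out.println("" + highDigit(b) + lowDigit(b)); // prints ab
        System.out.println(toHex(new byte[] {0x00, 0x7F, (byte) 0xFF})); // prints 007fff
    }
}
```

The byte 0xAB is negative in Java, so it exercises the case where the 0xF0 mask is needed to keep the array index within 0 to 15.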

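3. The reverse conversion: String to byte[]

Above we converted byte[] to String using hexadecimal encoding. How do we decode in the other direction, that is, turn a hex-encoded String back into the original byte[]?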

    /**
     * Converts the specified Hex-encoded String into a raw byte array. This is a
     * convenience method that merely delegates to {@link #decode(char[])} using the
     * argument's hex.toCharArray() value.
     *
     * @param hex a Hex-encoded String.
     * @return A byte array containing binary data decoded from the supplied String's char array.
     */
    public static byte[] decode(String hex) {
        return decode(hex.toCharArray());
    }

    /**
     * Converts an array of characters representing hexadecimal values into an
     * array of bytes of those same values. The returned array will be half the
     * length of the passed array, as it takes two characters to represent any
     * given byte. An exception is thrown if the passed char array has an odd
     * number of elements.
     *
     * @param data An array of characters containing hexadecimal digits
     * @return A byte array containing binary data decoded from
     *         the supplied char array.
     * @throws IllegalArgumentException if an odd number of characters or an
     *         illegal character is supplied
     */
    public static byte[] decode(char[] data) throws IllegalArgumentException {
        int len = data.length;
        if ((len & 0x01) != 0) {
            throw new IllegalArgumentException("Odd number of characters.");
        }
        byte[] out = new byte[len >> 1];
        // two characters form the hex value.
        for (int i = 0, j = 0; j < len; i++) {
            int f = toDigit(data[j], j) << 4;  // first character: high 4 bits
            j++;
            f = f | toDigit(data[j], j);       // second character: low 4 bits
            j++;
            out[i] = (byte) (f & 0xFF);
        }
        return out;
    }

    // toDigit is not shown in this listing; it is assumed here to map a hex
    // character to its value 0-15 and to reject any other character.
    private static int toDigit(char ch, int index) throws IllegalArgumentException {
        int digit = Character.digit(ch, 16);
        if (digit == -1) {
            throw new IllegalArgumentException(
                    "Illegal hexadecimal character " + ch + " at index " + index);
        }
        return digit;
    }

byte[] out = new byte[len >> 1]; makes the resulting byte[] half the size of the char[].

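toDigit(data[j], j) << 4 converts the first character of each pair into its numeric value and shifts it into the high 4 bits; the value of the second character is then OR-ed into the low 4 bits to rebuild the original byte.

A test run on the string ???hello/sasewredfdd>>>. Hello 世界！ prints, among other lines:

    str2=???hello/sasewredfdd>>>. Hello 世界！
    true
    sha=37a9715fecb5e2f9812d4a02570636e3d5fe476fc67ac34bc824d6a8f835635d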
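A round trip through hexadecimal can be sketched as follows. This is a minimal stand-alone illustration, not the article's test code: the class and method names are made up, and UTF-8 is an assumed choice of charset for the text.

```java
import java.nio.charset.StandardCharsets;

class HexRoundTripSketch {
    // Encode each byte as two lowercase hex characters.
    static String encode(byte[] data) {
        StringBuilder sb = new StringBuilder(data.length * 2);
        for (byte b : data) {
            sb.append(Character.forDigit((b >> 4) & 0x0F, 16));
            sb.append(Character.forDigit(b & 0x0F, 16));
        }
        return sb.toString();
    }

    // Decode pairs of hex characters back into bytes.
    static byte[] decode(String hex) {
        if ((hex.length() & 1) != 0) {
            throw new IllegalArgumentException("Odd number of characters.");
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = Character.digit(hex.charAt(2 * i), 16);
            int lo = Character.digit(hex.charAt(2 * i + 1), 16);
            if (hi == -1 || lo == -1) {
                throw new IllegalArgumentException("Illegal hexadecimal character.");
            }
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }

    public static void main(String[] args) {
        // The last three characters are the two CJK characters and a full-width '!'
        String str1 = "???hello/sasewredfdd>>>. Hello \u4e16\u754c\uff01";
        String hex = encode(str1.getBytes(StandardCharsets.UTF_8));
        String str2 = new String(decode(hex), StandardCharsets.UTF_8);
        System.out.println(hex);
        System.out.println(str1.equals(str2));
    }
}
```

Here the bytes being hex-encoded came from a String, so decoding them with the same charset after the round trip is safe.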

In the final call, new SimpleHash("sha-256", str, "11d23ccf28fc1e8cbab8fea97f101fc1d", 2).toString(), the .toString() method uses hexadecimal encoding to convert the hashed byte[] into a hexadecimal string.

Look at the result: 37a9715fecb5e2f9812d4a02570636e3d5fe476fc67ac34bc824d6a8f835635d

It consists entirely of the 16 characters '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'a', 'b', 'c', 'd', 'e', 'f' and contains no other characters.
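The same idea can be sketched with only the JDK. Note the swap: this uses java.security.MessageDigest rather than SimpleHash, with a single unsalted SHA-256 pass, so its output will not match the salted, two-iteration value shown above. The class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class Sha256HexSketch {
    // Hash the UTF-8 bytes of the text once with SHA-256 and hex-encode the digest.
    static String sha256Hex(String text) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                sb.append(Character.forDigit((b >> 4) & 0x0F, 16));
                sb.append(Character.forDigit(b & 0x0F, 16));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is expected to be available on standard JDKs
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String hex = sha256Hex("abc");
        System.out.println(hex);
        System.out.println(hex.length()); // 32 digest bytes give 64 hex characters
    }
}
```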

Instead of hexadecimal, we could also use Base64 encoding here:

Base64.encodeToString(str.getBytes())

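It encodes using the 64 characters a-z, A-Z, 0-9, / and +, with each of the values 0 to 63 represented by one of these characters.

A distinguishing feature of its output is that it may end with one or two = characters: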

Pz8/aGVsbG8vc2FzZXdyZWRmZGQ+Pj4uIEhlbGxvIOS4lueVjO+8gQ==

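The reason is that Base64 processes the byte[] array three consecutive bytes at a time. The array length may not be a multiple of 3, leaving a remainder of 1 or 2 bytes; a remainder of 1 byte is padded with two = characters, and a remainder of 2 bytes with one =.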
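The padding behaviour is easy to check with the JDK's own java.util.Base64, used here in place of the Base64 helper called above. The class and method names are made up for illustration:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class Base64PaddingSketch {
    // Base64-encode the UTF-8 bytes of the text with the standard alphabet.
    static String encode(String text) {
        return Base64.getEncoder().encodeToString(text.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        System.out.println(encode("a"));   // 1 byte left over:  YQ==
        System.out.println(encode("ab"));  // 2 bytes left over: YWI=
        System.out.println(encode("abc")); // multiple of 3:     YWJj
    }
}
```

An input of 1 byte ends in two = characters, 2 bytes end in one =, and 3 bytes need no padding.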

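To summarize:

Base64 output may end with one or two = characters and may contain the / and + characters.

Hexadecimal output consists entirely of the 16 characters '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'a', 'b', 'c', 'd', 'e', 'f' and contains no other letters.

Encryption algorithms all work by transforming and computing on byte[].

A byte[] obtained from a String can be converted back to the original String with the same charset that was used to encode it (as long as that charset can represent every character in the String),

but the byte[] produced by encryption cannot be turned into a String by decoding it with a character encoding. It is usually hex-encoded into a String, which is then stored or compared.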
