Don't use utf8 as your MySQL character set

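MySQL's "utf8" character set is not real UTF-8: it stores at most 3 bytes per character. Many Unicode characters cannot be saved in it, for example the emoji common on mobile devices. The UTF-8 encoding that everyone actually uses has to support up to 4 bytes per character.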

MySQL's utf8mb4 is the character set that is UTF-8 in the true sense.
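
The 3-byte limit is easy to check outside the database. The following is a minimal Python sketch, not part of MySQL itself: the helper names utf8_len and fits_mysql_utf8 are made up for illustration, and the check assumes that a character fits in MySQL's utf8 exactly when its code point is at most U+FFFF (that is, when it needs at most 3 bytes in UTF-8).

```python
def utf8_len(ch: str) -> int:
    """Number of bytes one character occupies in standard UTF-8."""
    return len(ch.encode("utf-8"))


def fits_mysql_utf8(text: str) -> bool:
    """Hypothetical helper: True if every character needs at most 3 bytes
    in UTF-8, i.e. its code point is no higher than U+FFFF."""
    return all(ord(ch) <= 0xFFFF for ch in text)


# Latin letter, accented letter, CJK character, emoji (written as escapes
# so the file itself stays ASCII-safe).
samples = ["A", "\u00e9", "\u4e2d", "\U0001F600"]
for ch in samples:
    print(f"U+{ord(ch):04X}: {utf8_len(ch)} byte(s), "
          f"fits in 3 bytes: {fits_mysql_utf8(ch)}")
```

Only the last sample, the emoji U+1F600, needs 4 bytes, which is why it cannot be stored in a utf8 column but can be stored in a utf8mb4 one.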

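Which collation should go with it, utf8mb4_unicode_ci or utf8mb4_general_ci?
utf8mb4_unicode_ci is recommended.
utf8mb4_unicode_ci sorts and compares according to the Unicode standard, so it orders text accurately across languages. It is slower than utf8mb4_general_ci, which does not implement the full Unicode collation rules, but on today's CPUs the difference is too small to worry about.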
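
To act on this recommendation for an existing schema, the usual route is ALTER DATABASE plus ALTER TABLE ... CONVERT TO CHARACTER SET. The sketch below only builds those statements as strings and does not connect to any server. The helper names and the database and table names (app_db, users, comments) are hypothetical, and the output should be reviewed and tried on a copy of the data before being run for real.

```python
from typing import List, Sequence

CHARSET = "utf8mb4"
COLLATION = "utf8mb4_unicode_ci"


def quote_ident(name: str) -> str:
    """Quote a MySQL identifier with backticks, doubling embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def conversion_statements(database: str, tables: Sequence[str]) -> List[str]:
    """Build the SQL that switches a database default and its tables to utf8mb4."""
    stmts = [
        f"ALTER DATABASE {quote_ident(database)} "
        f"CHARACTER SET = {CHARSET} COLLATE = {COLLATION};"
    ]
    for table in tables:
        stmts.append(
            f"ALTER TABLE {quote_ident(table)} "
            f"CONVERT TO CHARACTER SET {CHARSET} COLLATE {COLLATION};"
        )
    return stmts


# Hypothetical database and table names, for illustration only.
for stmt in conversion_statements("app_db", ["users", "comments"]):
    print(stmt)
```

The client connection also has to use utf8mb4, otherwise 4-byte characters can still be lost on the way to the server.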

Note: Since MySQL 5.5.3 you should use utf8mb4 rather than utf8. They both refer to the UTF-8 encoding, but the older utf8 had a MySQL-specific limitation preventing use of characters numbered above 0xFFFD.

Accuracy

utf8mb4_unicode_ci is based on the Unicode standard for sorting and comparison, which sorts accurately in a very wide range of languages.

utf8mb4_general_ci fails to implement all of the Unicode sorting rules, which will result in undesirable sorting in some situations, such as when using particular languages or characters.

Performance

utf8mb4_general_ci is faster at comparisons and sorting, because it takes a bunch of performance-related shortcuts.

On modern servers, this performance boost will be all but negligible. It was devised in a time when servers had a tiny fraction of the CPU performance of today’s computers.

utf8mb4_unicode_ci, which uses the Unicode rules for sorting and comparison, employs a fairly complex algorithm for correct sorting in a wide range of languages and when using a wide range of special characters. These rules need to take into account language-specific conventions; not everybody sorts their characters in what we would call ‘alphabetical order’.
(Excerpted from Stack Overflow)