How to tell true lossless from fake lossless with spek - Digital | Yuzu Atri = MoMo の Blog = Keep learning, love life

# Chinese Ver.

# 什么是无损音乐？

简而言之就是，并没有刻意通过压缩的音乐。

众所周知的 MP3 就是把 CD（磁带、唱片等）音乐文件掐头去尾的方法。也就是说 MP3 主要压缩了音乐的频响范围、采样密度和动态范围这三个主要文件参数，从而达到体积小、传输方便的目的。而无损音乐指的就是无限接近于抓轨采样的直出文件，但它本身还是经过压缩的，只是这个压缩过程中充分保留了客观意义上更原汁原味的声音。

# 在哪里可以听到无损？

Apple Music。
乔布斯是狂热音乐爱好者，为了推广 ipod，曾和各唱片公司有协议，给苹果的都是母带。不过受限于网络，无法上传母带，所以上传的是基于母带压缩后（另一意义上的无损）的音乐。

国内平台（网易云、QQ 音乐、酷狗）起步时音乐来源，多半是经过数人之手不明来源的文件，可能第一次上传的是通过 CD 压缩的文件，但数人转载之后音乐质量不断下降，为了地使文件达到无损标准，会可以渲染音源，加一些无意义的高低频，增大体积。

# 如何辨别真假无损

# 方法一

直接用耳朵听。

最简单粗暴的方法，全损和无损普通人也能听出来。

但是要辨认出 320bit 和 flac、真假 flac，则需一台优良的前端、耳放（功放）、千元级的耳机，仔细辨别音色特点。

# 方法二

看后缀。
如果后缀名是 mp3、aac、ogg、m4a 等，不是无损；
如果是 flac、wav、ape，则四分之三可能是真无损。

看体积。
三分钟左右大于 20mb 则可能是无损，小于就不用考虑了。

# 方法三（正文）

通过项目 spek 辨别
链接：[spek]http://help.spek.cc/
spek 是通过分析音频文件的响频范围将其转化为图像，可以直观形象地辨别真假无损。原理是检测音频的高频信息是否有缺失，如果缺失很严重，就代表该音频可能是假的无损。不过不能 100% 确定，有很多原版的 CD 转出来的无损，就已经有高频损失的现象。只能说唱片制作过程中造成疏忽了 —— 这种现象屡见不鲜。

如果 22khz 以下的部分被整齐地切了一刀，就是假无损；24khz 全满就是真无损

看下图：
假无损

在 16khz 部分明显被整齐地切了一刀，虽然是 flac 后缀，但是假无损。

看下图：

22khz 全满，判定为真无损。

有些网友为了把低品质音乐伪装成无损，会故意添加一些人耳听不到的、无意义的高频，增大音频文件体积，如下：
假无损
大约 55-60khz 处整齐地切一刀，以上的响频响度基本相同，一眼看出人为添加。

# 为什么需要 22khz 以上的响频？

众所周知，人耳的听力范围大约为 20-20000hz 响频的声音，稍微敏感一点的耳朵高频可以达到 22000-24000h。那无损中随处可见的 44khz、192khz 对人耳有什么意义？
说实话，真的极少的发烧友才能听出 44khz 与 192khz 的区别，更多的不过是心理优越感的暗示和烧子的信仰。
心理暗示和信仰真的有奇效。

~~信则有，不信则无。~~

但是还是有规律可循的解释。下面是知友破风神话根据 “奈奎斯特 - 香农采样定理” 的回答：

假如我用 44.1khz 采样率的方式去采样一个 22.05khz 的正弦声波（我们假设人耳能听到 22.05khz 的声波，这个假设其实虽然有点离谱，但还不算太离谱），那么就意味着正弦波的一个周期会被采样两个点。

你（和奈奎斯特）认为的理想情况：这两个点分别都正好对齐采样在正弦波的最大和最小值点上，得到的数字信号在回放的时候再通过一个低通滤波器就能正好完美恢复原先的声波了。

实际的情况：这两个点根本不可能采在声波的最大和最小值点上，这样的回放系统会导致高频信息量的逐步衰减。

实际的最烂情况（还经常发生）：这两个点很不巧正好采样在正弦波的过零点上，高频信息完全丢失。

而用 96khz 的采样率：

理想情况：完全没必要，浪费资源，这群人是煞笔。

实际的情况：平均一个周期被采样了 4 个点多一些，采样在最大和最小值的概率增加了，同时即使没有增加，采到的点距离最大和最小值的误差也减少了，信息量的衰减相比 44.1khz 减少了许多。

实际的最烂情况（还经常发生）：最多有两个点采样在过零点上，反正我还有另外两个点，而且这种情况下另外两个点基本都在声波极值点上，我依然可以恢复信息，再也不会出现那种信息完全丢失的情况了。

懂了吗，“奈奎斯特 - 香农采样定理” 固然指出” 理论上 40KHz 的采样率就够了 “，但是在实际工程生产实践中，增加采样率是可以大幅度降低高频的信息丢失的。

岔开话再说一句，不仅是在音频领域，在通信领域，现在也基本不会有人去做正好满足 “采样频率是频谱中最高频率的 2 倍” 的通信系统，理由就是如我上面所解释的一样。在工程实践中”8 倍过采样 “、”16 倍过采样 “都是常态。

估计有人要和我杠说：” 我 15KHz 以上都听不见了，你还在这里扯 22KHz！？“或者” 高频要那么多信息干什么？“，行，我们再换一个角度去理解采样率：

假如我用 44.1khz 采样率的方式去采样一个 10khz 的正弦声波（这个假设可一点都不离谱），那么平均每个周期的” 声音分辨率 “就是 4.41 个点；

而我如果用 96khz 的采样率，那么平均每个周期的” 声音分辨率 “就是 9.6 个点；

如果用 192khz 的采样率，那么平均每个周期的” 声音分辨率 “就是 19.2 个点；

如果用 384khz 的采样率，那么平均每个周期的” 声音分辨率 “就是 38.4 个点；

而回放还原的时候，就像是你拿着这些点，然后” 描点作图 “，大家小时候肯定都学过数学，你觉得同样是让你描一个正弦波，一个周期只给你稀疏的两个点与一个周期给你密密麻麻 38 个点这两个哪个难度更高？哪个描出来更漂亮？我觉得不用我说了吧。

所以说，高采样率可以使得低频信号的还原变得更精确。我个人认为 Sony 将其命名为”Hi-Res“是非常恰当的，即：” 高解析度音频 “。

# English Ver

This version was translated with Google Translate, so it may contain mistakes.

# What is lossless music?

In short, it is music that has not been deliberately degraded by compression.

The familiar MP3 format works by trimming away parts of the source audio (from a CD, tape, record, etc.). More precisely, MP3 mainly cuts down three parameters of the music: its frequency range, its sampling density, and its dynamic range, in exchange for a small file that is easy to transfer. Lossless music, by contrast, is a file that stays as close as possible to the directly ripped track. It is still compressed, but the compression preserves the original sound in an objective sense.

# Where can I listen to lossless?

Apple Music. Steve Jobs was an avid music lover. To promote the iPod, he reached agreements with the record companies under which Apple was given the master recordings. Because of network limits the masters themselves could not be uploaded, so what was uploaded was music compressed from the masters (lossless in another sense).

When the domestic platforms (NetEase Cloud Music, QQ Music, Kugou) started out, most of their music came from files of unknown origin that had passed through many hands. The first upload may have been compressed from a CD, but the quality kept dropping with each repost. To make such files meet the "lossless" standard, uploaders would deliberately re-render the audio and add meaningless high and low frequencies to inflate the file size.

# How to tell true lossless from fake lossless

# Method 1

Listen with your own ears.

This is the simplest and crudest method; even ordinary listeners can hear the difference between heavily lossy and lossless audio.

However, telling a 320 kbps MP3 from FLAC, or true FLAC from fake FLAC, takes a good source, a headphone amplifier (or power amplifier), headphones in the thousand-yuan class, and careful attention to the character of the sound.

# Method 2

Look at the file extension. If it is mp3, aac, ogg, m4a, etc., the file is not lossless; if it is flac, wav, or ape, there is roughly a three-in-four chance that it is true lossless.

Look at the file size. A track of about three minutes that is larger than 20 MB may be lossless; anything smaller is not worth considering.
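To see where a threshold like 20 MB comes from, here is a rough back-of-the-envelope sketch. It assumes CD-quality audio (44.1 kHz, 16-bit, stereo) and a constant-bitrate MP3; the function names are made up for this example.

```python
def pcm_size_mb(seconds, sample_rate=44_100, bit_depth=16, channels=2):
    """Size in MB (10^6 bytes) of uncompressed PCM audio, e.g. a WAV rip."""
    bytes_per_sample = bit_depth // 8
    return seconds * sample_rate * bytes_per_sample * channels / 1_000_000


def cbr_size_mb(seconds, kbps):
    """Size in MB of a constant-bitrate stream such as a 320 kbps MP3."""
    return seconds * kbps * 1000 / 8 / 1_000_000


three_minutes = 180
print(pcm_size_mb(three_minutes))       # 31.752
print(cbr_size_mb(three_minutes, 320))  # 7.2
```

A losslessly compressed file usually lands somewhere between these two figures, so a three-minute "FLAC" that is only a few MB deserves suspicion.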

# Method 3 (the main topic)

Identify it with spek.

Link: [spek](http://help.spek.cc/)
spek analyzes the frequency content of an audio file and renders it as an image, which makes it easy to tell true lossless from fake at a glance. The principle is to check whether the high-frequency content of the audio is missing: if a lot of it is missing, the file is probably fake lossless. This is not 100% certain, though. Many lossless files ripped from original CDs already show high-frequency loss, which can only be put down to carelessness during record production, and it is a common occurrence.

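If the spectrum is cut off cleanly somewhere below 22 kHz, the file is fake lossless; if content fills the plot all the way to the top (about 22 kHz for a 44.1 kHz file, 24 kHz for a 48 kHz file), it is true lossless.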
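The "clean cut" check can also be sketched in code. The following is only an illustration of the idea, not how spek itself is implemented: it runs a naive DFT over a short synthetic signal and reports the highest frequency that still carries energy. The function names and the -60 dB threshold are assumptions made for this example.

```python
import cmath
import math


def magnitude_spectrum(samples):
    """Naive O(N^2) DFT; returns magnitudes for bins 0..N//2."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(samples))
        mags.append(abs(acc) / n)
    return mags


def estimate_cutoff_hz(samples, sample_rate, threshold_db=-60.0):
    """Highest frequency whose level is within threshold_db of the peak."""
    mags = magnitude_spectrum(samples)
    limit = max(mags) * 10 ** (threshold_db / 20)
    bin_hz = sample_rate / len(samples)
    highest_bin = max(k for k, m in enumerate(mags) if m >= limit)
    return highest_bin * bin_hz


def tones(freqs_hz, sample_rate, count):
    """Synthetic test signal: a sum of equal-amplitude sine waves."""
    return [sum(math.sin(2 * math.pi * f * i / sample_rate) for f in freqs_hz)
            for i in range(count)]


rate = 44_100
suspicious = tones([1_000, 8_000, 16_000], rate, 441)        # nothing above 16 kHz
full_band = tones([1_000, 8_000, 16_000, 21_000], rate, 441)  # reaches 21 kHz
print(estimate_cutoff_hz(suspicious, rate))
print(estimate_cutoff_hz(full_band, rate))
```

On a real track you would average over many windows rather than a single 441-sample block, but the decision is the same: a cutoff far below half the sample rate is the warning sign.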

[Image: fake lossless]

The spectrum is cut off cleanly at about 16 kHz. Despite the FLAC extension, this file is fake lossless.

[Image: true lossless]

Content fills the plot all the way up to 22 kHz, so this file is judged to be true lossless.

To pass low-quality music off as lossless, some people deliberately add meaningless high frequencies that the human ear cannot hear, which inflates the file size, as in this example:

[Image: fake lossless]

There is a clean cut at about 55-60 kHz, and above it the level is essentially uniform across frequency, so it is obvious at a glance that this content was added artificially.

# Why are frequencies above 22 kHz needed?

As is well known, human hearing covers roughly 20-20000 Hz, and slightly more sensitive ears can reach 22000-24000 Hz at the top end. So what do the 44.1 kHz and 192 kHz sampling rates seen everywhere in lossless audio mean for the human ear? To be honest, only a tiny number of audiophiles can hear the difference between 44.1 kHz and 192 kHz; for most people it is more a matter of psychological suggestion, a sense of superiority, and audiophile faith. Suggestion and faith really do work wonders.

But there is also a more systematic explanation. Below is an answer by the Zhihu user 破风神话, based on the Nyquist-Shannon sampling theorem:

Suppose I sample a 22.05 kHz sine wave at a 44.1 kHz sampling rate (let us assume the human ear can hear 22.05 kHz; that assumption is a bit of a stretch, but not absurd). Each cycle of the sine wave is then sampled at two points.

The ideal case that you (and Nyquist) have in mind: the two points land exactly on the maximum and minimum of the sine wave, and on playback the digital signal passes through a low-pass filter that restores the original wave perfectly.

The real case: the two points simply will not land exactly on the maximum and minimum of the wave, so such a playback system progressively attenuates the high-frequency information.

The real worst case (which happens often): the two points happen to land exactly on the zero crossings of the sine wave, and the high-frequency information is lost completely.
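The ideal and worst cases above, for a tone at exactly half the sampling rate, can be reproduced with a few lines of arithmetic. This is a minimal sketch; `sample_sine` is a helper name invented for the example.

```python
import math


def sample_sine(freq_hz, sample_rate_hz, phase_rad, count):
    """Sample sin(2*pi*f*t + phase) at t = n / sample_rate for n = 0..count-1."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate_hz + phase_rad)
            for n in range(count)]


# A 22.05 kHz tone sampled at 44.1 kHz gives exactly two samples per cycle.
on_peaks = sample_sine(22_050, 44_100, math.pi / 2, 8)  # samples hit max/min
on_zeros = sample_sine(22_050, 44_100, 0.0, 8)          # samples hit zero crossings

print([round(s, 6) for s in on_peaks])  # alternates between +1 and -1
print(max(abs(s) for s in on_zeros))    # essentially zero: the tone vanishes
```

Only the phase differs between the two runs; the tone itself is identical.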

Now with a 96 kHz sampling rate:

Ideal case: completely unnecessary, a waste of resources, and the people doing it are idiots.

Real case: each cycle is sampled at a little over 4 points on average, so the chance of sampling at the maximum and minimum goes up. Even if it did not, the sampled points lie closer to the maximum and minimum, so far less information is lost than at 44.1 kHz.

Real worst case (which happens often): at most two points land on the zero crossings. I still have the other two points, and in that situation they sit more or less on the extremes of the wave, so I can still recover the information. The total loss of information never happens again.

Get it? The Nyquist-Shannon sampling theorem does say that "in theory, a 40 kHz sampling rate is enough", but in real engineering practice, raising the sampling rate greatly reduces the loss of high-frequency information.

As an aside, this is not limited to audio. In communications, too, hardly anyone today builds a system that only just satisfies "the sampling frequency is twice the highest frequency in the spectrum", for the reason explained above. In engineering practice, "8x oversampling" and "16x oversampling" are the norm.

I expect someone will object: "I can't hear anything above 15 kHz, and you're still going on about 22 kHz!?" or "Why do the high frequencies need that much information?" Fine, let's look at the sampling rate from another angle:

If I sample a 10 kHz sine wave at a 44.1 kHz sampling rate (nothing far-fetched about this assumption), the average "sound resolution" is 4.41 points per cycle;

at a 96 kHz sampling rate, the average "sound resolution" is 9.6 points per cycle;

at a 192 kHz sampling rate, the average "sound resolution" is 19.2 points per cycle;

at a 384 kHz sampling rate, the average "sound resolution" is 38.4 points per cycle.
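These figures are simply the sampling rate divided by the tone frequency. A minimal sketch of the arithmetic, with a function name chosen only for this example:

```python
def points_per_cycle(sample_rate_hz, tone_hz):
    """Average number of samples taken during one cycle of a tone."""
    return sample_rate_hz / tone_hz


for rate in (44_100, 96_000, 192_000, 384_000):
    print(rate, points_per_cycle(rate, 10_000))
# 44100 4.41
# 96000 9.6
# 192000 19.2
# 384000 38.4
```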

On playback, reconstruction is like taking these points and plotting a curve through them. Everyone learned this in math class as a child. If you are asked to draw a sine wave, which is harder: with only two sparse points per cycle, or with 38 densely packed points per cycle? And which drawing comes out nicer? I don't think I need to spell it out.

So a high sampling rate makes the reconstruction of low-frequency signals more precise. Personally, I think Sony's name for it, "Hi-Res", is very apt: "high-resolution audio".
