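Preface

This article records common audio operations in Python, using .wav files as the example. Many ready-made audio toolkits are available online; if you only need to call existing functionality, a toolkit is more convenient.

For more Python operations, see: 用python做科学计算 (Scientific Computing with Python)

1. Batch-reading .wav file names: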
import os

filepath = "./data/"
filename = os.listdir(filepath)
for file in filename:
    print(filepath + file)

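Note that os.listdir returns every entry in the directory, not only .wav files. The following is a minimal sketch of one way to filter the listing; the helper name list_wav_files is hypothetical and not part of the original code.

```python
import os

def list_wav_files(dirpath):
    """Return sorted full paths of the .wav files directly inside dirpath."""
    paths = []
    for name in sorted(os.listdir(dirpath)):
        full = os.path.join(dirpath, name)
        # Keep regular files whose extension is .wav (case-insensitive).
        if os.path.isfile(full) and name.lower().endswith(".wav"):
            paths.append(full)
    return paths
```

Calling list_wav_files("./data/") and printing each returned path gives the same kind of output as the loop above, minus any non-.wav entries. os.path.join also removes the need for a trailing slash on the directory name.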
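File paths here are strings, which come in three forms:
1. Ordinary strings (str)
2. Raw strings, prefixed with an uppercase R or lowercase r, e.g. r''; special characters are not escaped
3. Unicode strings, u''; a subclass of basestring in Python 2 (in Python 3 every str is already Unicode)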
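For example:
path = './file/n'
path = r'.\file\n'
path = '.\\file\\n'
The second and third lines produce the identical string, and on Windows all three refer to the same path. The backslash is the escape character; prefixing the quote with r marks a raw string (r: raw string), so no escaping takes place.

Common ways to get help: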
>>> help(str)
>>> dir(str)
>>> help(str.replace)
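
To make the raw-string point concrete, here is a small sketch (the variable names are illustrative only) that compares a raw string, an explicitly escaped string, and an unescaped one in which \f and \n are silently interpreted as control characters.

```python
# A raw string and an explicitly escaped string spell the same characters.
raw = r'.\file\n'
escaped = '.\\file\\n'
# Without r or doubled backslashes, \f becomes a form feed and \n a newline.
unescaped = '.\file\n'

print(raw == escaped)             # True
print(len(raw), len(unescaped))   # 8 6
```

The length difference shows why Windows-style paths are usually written as raw strings or with forward slashes.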
2. Reading .wav files

wave.open usage:
wave.open(file, mode)
mode can be 'rb' to read a file or 'wb' to write one; simultaneous read/write is not supported.

Wave_read.getparams usage:
f = wave.open(file, 'rb')
params = f.getparams()
nchannels, sampwidth, framerate, nframes = params[:4]
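The last line unpacks the commonly used audio parameters:
nchannels: number of channels
sampwidth: sample width (quantization depth) in bytes
framerate: sampling frequency
nframes: number of sample frames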
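
As a self-contained illustration of both modes, the sketch below first writes a tiny file with 'wb' and then reads its parameters back with 'rb'. The file name demo.wav, the temporary directory, and the sample values are assumptions made for the example, not part of the original article.

```python
import os
import struct
import tempfile
import wave

# Write a short mono, 16-bit, 8 kHz file so there is something to read back.
path = os.path.join(tempfile.mkdtemp(), "demo.wav")
samples = [0, 1000, -1000, 32767, -32768]

with wave.open(path, "wb") as w:
    w.setnchannels(1)      # mono
    w.setsampwidth(2)      # 2 bytes per sample = 16-bit
    w.setframerate(8000)   # sampling rate in Hz
    # Pack the integers as little-endian signed 16-bit values.
    w.writeframes(struct.pack("<%dh" % len(samples), *samples))

with wave.open(path, "rb") as f:
    nchannels, sampwidth, framerate, nframes = f.getparams()[:4]

print(nchannels, sampwidth, framerate, nframes)  # 1 2 8000 5
```

Using wave.open in a with statement closes the file automatically, which also finalizes the frame count in the header when writing.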

Full code for reading and plotting a single-channel file:
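import os
import wave

import matplotlib.pyplot as plt
import numpy as np

filepath = "./data/"
filename = os.listdir(filepath)

# Open the second file in the directory and read its header parameters.
f = wave.open(filepath + filename[1], 'rb')
params = f.getparams()
nchannels, sampwidth, framerate, nframes = params[:4]

# Read all frames as bytes and decode them as 16-bit samples.
# np.frombuffer replaces the deprecated np.fromstring.
strData = f.readframes(nframes)
f.close()
waveData = np.frombuffer(strData, dtype=np.int16)
# Normalize to [-1, 1]; convert to float first so abs() cannot overflow int16.
waveData = waveData.astype(np.float64)
waveData = waveData / np.max(np.abs(waveData))

# Time axis in seconds, one point per frame.
time = np.arange(0, nframes) * (1.0 / framerate)
plt.plot(time, waveData)
plt.xlabel("Time(s)")
plt.ylabel("Amplitude")
plt.title("Single channel wavedata")
plt.grid(True)
plt.show()
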
Resulting plot:
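
The code above hard-codes np.int16, which is only correct when sampwidth is 2. The following sketch shows one way to pick the dtype from sampwidth instead. The helper frames_to_float and the mapping table are hypothetical names introduced here, and the mapping assumes integer PCM data (24-bit and floating-point files are not handled).

```python
import numpy as np

# Assumed mapping from bytes per sample to a NumPy dtype for integer PCM data.
# 8-bit WAV samples are unsigned; 16- and 32-bit samples are signed little-endian.
SAMPWIDTH_TO_DTYPE = {1: np.uint8, 2: "<i2", 4: "<i4"}

def frames_to_float(raw_bytes, sampwidth):
    """Decode PCM bytes and scale so the peak absolute value becomes 1.0."""
    if sampwidth not in SAMPWIDTH_TO_DTYPE:
        raise ValueError("unsupported sample width: %d" % sampwidth)
    data = np.frombuffer(raw_bytes, dtype=SAMPWIDTH_TO_DTYPE[sampwidth])
    # Work in float64 so abs() cannot overflow the integer type.
    data = data.astype(np.float64)
    if sampwidth == 1:
        data -= 128.0  # centre unsigned 8-bit samples around zero
    peak = np.max(np.abs(data)) if data.size else 0.0
    # Leave silent or empty input unchanged rather than dividing by zero.
    return data / peak if peak > 0 else data
```

With this helper, the decode and normalize steps could be written as waveData = frames_to_float(strData, sampwidth).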

Here the file has 3 channels. The main extra step is an np.reshape; everything else is identical to the single-channel case. The corresponding code:
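import os
import wave

import matplotlib.pyplot as plt
import numpy as np

filepath = "./data/"
filename = os.listdir(filepath)

# Open the first file in the directory and read its header parameters.
f = wave.open(filepath + filename[0], 'rb')
params = f.getparams()
nchannels, sampwidth, framerate, nframes = params[:4]

# Read all frames as bytes and decode them as 16-bit samples.
# np.frombuffer replaces the deprecated np.fromstring.
strData = f.readframes(nframes)
waveData = np.frombuffer(strData, dtype=np.int16)
# Normalize to [-1, 1]; convert to float first so abs() cannot overflow int16.
waveData = waveData.astype(np.float64)
waveData = waveData / np.max(np.abs(waveData))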
waveData