To equip our intelligent baby-sitter with the ability to recognize a baby's cry, we need to train a suitable machine learning model, and that requires enough training data. To improve the model's performance, the training data includes not only baby crying sounds but also the different kinds of noise that could occur in the vicinity of the car. We collected this data by recording in real environments and by using YouTube as a supplement.
1. Baby crying sound
Collecting baby crying sounds is essential but challenging. It is hard, and rather embarrassing, to ask babies we meet in the street to cry for us. Instead, we visited hospitals and recorded babies crying there. As a supplementary measure, we also searched for and recorded baby crying sounds on YouTube.
Result:
We collected 100+ sound clips, each approximately 4 seconds long.
2. Noises
Noise is equally important data. Since our usage scenario is set in NYC, one of the most crowded places in the world, our intelligent baby-sitter must be able to distinguish a baby's cry from any kind of background noise. These noises include car horns, the sirens of police cars, fire trucks and ambulances, and loud chatting. We therefore need to collect all of these sounds.
We collected this data ourselves by walking around the streets. In more detail, we collected the following noises:
(a) White noise: when we walked through crowded areas like Times Square, what we heard was a mingled, white-noise-like background. We collected this data by walking in squares and by sitting in restaurants, buses and the subway.
(b) Alarm sounds: this noise is very important, for two reasons. First, it is extremely common in NYC. Second, the frequency and amplitude of sirens are similar to those of a baby’s cry. We therefore made a great effort to collect it, recording the sirens of passing police cars, fire trucks and ambulances. We also got supplementary data from YouTube.
(c) Car horns: we collected this kind of noise in traffic jams.
(d) People chatting: initially we did not think of recording this kind of data. However, our test results led us to believe that the machine learning module relies on the “sound energy” to some extent, so the white noise mentioned above is not enough. For this reason we collected chatting sounds with relatively high acoustic energy.
(e) Random noises: we carried our recording equipment with us all day and collected various other sounds, such as car doors shutting, dogs barking and vacuum cleaners running.
Result:
We collected 400+ sound clips, each approximately 4 seconds long.
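To make the clip length and the “sound energy” mentioned in (d) concrete, here is a minimal sketch of how a longer recording could be cut into 4-second clips and how the energy of each clip could be estimated. It is an illustration only, not the pipeline we actually used: the function names split_into_clips and rms_energy are hypothetical, the recording is assumed to be a mono sequence of float samples, and root-mean-square (RMS) amplitude is assumed as the energy measure.

```python
import math


def split_into_clips(samples, sample_rate, clip_seconds=4.0):
    """Cut a mono recording into consecutive fixed-length clips.

    Hypothetical helper: a trailing remainder shorter than one clip
    is dropped so that every clip has the same length.
    """
    clip_len = int(sample_rate * clip_seconds)
    if clip_len <= 0:
        raise ValueError("sample_rate * clip_seconds must be at least 1")
    last_start = len(samples) - clip_len
    return [samples[i:i + clip_len] for i in range(0, last_start + 1, clip_len)]


def rms_energy(clip):
    """Root-mean-square amplitude of a clip (assumed stand-in for "sound energy")."""
    if not clip:
        return 0.0
    return math.sqrt(sum(s * s for s in clip) / len(clip))


if __name__ == "__main__":
    # Synthetic 10-second 440 Hz tone standing in for a real recording.
    rate = 8000
    tone = [0.5 * math.sin(2 * math.pi * 440 * n / rate) for n in range(10 * rate)]
    for index, clip in enumerate(split_into_clips(tone, rate)):
        print(index, len(clip), round(rms_energy(clip), 4))
```

Comparing the RMS values of chatting clips against those of crying clips is one simple way to check whether the two classes overlap in energy, which is the concern that led us to record louder chatting.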