Feature learning and deep architectures: new directions for music informatics

Humphrey, Eric J.; Bello, Juan P.; LeCun, Yann
December 2013
Journal of Intelligent Information Systems;Dec2013, Vol. 41 Issue 3, p461
Academic Journal
As we look to advance the state of the art in content-based music informatics, there is a general sense that progress is decelerating throughout the field. On closer inspection, performance trajectories across several applications reveal that this is indeed the case, raising some difficult questions for the discipline: why are we slowing down, and what can we do about it? Here, we strive to address both of these concerns. First, we critically review the standard approach to music signal analysis and identify three specific deficiencies of current methods: hand-crafted feature design is sub-optimal and unsustainable, the power of shallow architectures is fundamentally limited, and short-time analysis cannot encode musically meaningful structure. Acknowledging breakthroughs in other perceptual AI domains, we offer that deep learning holds the potential to overcome each of these obstacles. Through conceptual arguments for feature learning and deeper processing architectures, we demonstrate how deep processing models are more powerful extensions of current methods, and why now is the time for this paradigm shift. Finally, we conclude with a discussion of current challenges and the potential impact to further motivate an exploration of this promising research area.


Related Articles

  • Learning-Based Spectrum Sensing for Cognitive Radio Systems. Hassan, Yasmin; El-Tarhuni, Mohamed; Assaleh, Khaled // Journal of Computer Networks & Communications;2012, p1 

    This paper presents a novel pattern recognition approach to spectrum sensing in collaborative cognitive radio systems. In the proposed scheme, discriminative features from the received signal are extracted at each node and used by a classifier at a central node to make a global decision about...

  • New Learning Methods for Supervised and Unsupervised Preference Aggregation. Volkovs, Maksims N.; Zemel, Richard S. // Journal of Machine Learning Research;Mar2014, Vol. 15 Issue 3, p1135 

    In this paper we present a general treatment of the preference aggregation problem, in which multiple preferences over objects must be combined into a single consensus ranking. We consider two instances of this problem: unsupervised aggregation where no information about a target ranking is...

  • Exploiting spectro-temporal locality in deep learning based acoustic event detection. Espi, Miquel; Fujimoto, Masakiyo; Kinoshita, Keisuke; Nakatani, Tomohiro // EURASIP Journal on Audio Speech & Music Processing;9/15/2015, Vol. 2015 Issue 1, p1 

    In recent years, deep learning has not only permeated the computer vision and speech recognition research fields but also fields such as acoustic event detection (AED). One of the aims of AED is to detect and classify non-speech acoustic events occurring in conversation scenes including those...

  • Adaptive Face Gender Recognition Based on NSGA-II and Compressive Sensing. Lin Zhang // Applied Mechanics & Materials;2015, Vol. 738-739, p586 

    A face gender recognition system should achieve not only a high recognition rate but also fast recognition, which makes it a multi-objective optimization problem. In this paper, we propose an adaptive face gender recognition method based on NSGA-II which...

  • Research on Adaptive Face Gender Recognition Based on Compressive Sensing. Zhang Lin // Advanced Materials Research;7/24/2014, Vol. 989-994, p4187 

    An adaptive gender recognition method is proposed in this paper. First, a multiwavelet transform is applied to the face image to obtain its low-frequency information; features are then extracted from the low-frequency information using compressive sensing (CS), and an extreme learning machine (ELM) is used to achieve...

  • Refinement of Hyperspectral Image Classification with Segment-Tree Filtering. Lu Li; Chengyi Wang; Jingbo Chen; Jianglin Ma // Remote Sensing;Jan2017, Vol. 9 Issue 1, p1 

    This paper proposes a novel method of segment-tree filtering to improve the classification accuracy of hyperspectral images (HSI). Segment-tree filtering is a versatile method that incorporates spatial information and has been widely applied in image preprocessing. However, to use this powerful...

  • R-peaks Detection using Local Mean Decomposition. Chen Diao; Aihua Zhang; Bin Wang; Caixia Wang // International Journal of Advancements in Computing Technology;May2013, Vol. 5 Issue 9, p1103 

    The electrocardiogram (ECG) signal is the expression of myocardial electrical activity on the body surface, and it appears as a nearly periodic signal. The ECG contains a significant amount of information about heart disease. Analysis of the ECG relies largely on the accuracy of the R-peak detector. In...

  • Feature correspondence in a non-overlapping camera network. Xiang, Zong; Chen, Qiren; Liu, Yuncai // Multimedia Tools & Applications;Dec2014, Vol. 73 Issue 3, p1129 

    Person re-identification across multiple cameras is difficult due to viewpoint and illumination variations. Most traditional research focuses on developing invariant features that are unaffected by these variations. However, thus far, there has been no feature developed that is completely...

  • Analysis of Palm Vein Pattern Recognition Algorithms and Systems. Ahmed, Mona A.; Ebied, Hala M.; El-Horbaty, El-Sayed M.; Salem, Abdel-Badeeh M. // International Conference on Intelligent Computing & Information ;2013, p275 

    Palm vein authentication offers a high level of accuracy because the vein pattern is located inside the body, does not change over a lifetime, and cannot be stolen. This paper presents an analysis of palm vein pattern recognition algorithms, techniques, methodologies and systems. It discusses the technical aspects...

