Abstract
Natural sounds, whether music or conspecific communications, frequently contain multiple amplitude modulation (AM) components. AM, the temporal envelope of a sound, plays a critical role in pitch perception. However, how multiple AM components are distributed across the tonotopic regions of the human cochlea to form pitch percepts remains unclear. To address this, we examined human judgments of multi-tone stimuli with systematically combined amplitude envelopes and carrier frequencies in pitch discrimination, pitch matching, and melodic contour identification tasks. The results show that a single amplitude envelope within a multi-tone dominates pitch perception, rather than envelopes being integrated uniformly across the cochlear spectrum, as conventionally believed. Specifically, participants accurately discriminated pitch differences when two multi-tones differed only in the temporal envelope modulating the lowest-frequency carrier. In contrast, pitch discrimination accuracy dropped below chance when the differing temporal envelope modulated higher-frequency carriers. Varying the number of tones or shifting the carrier frequencies across different tonotopic regions did not alter AM-envelope-based pitch percepts. Additionally, nonlinear Holo-Hilbert Spectral Analysis confirmed that pitch percepts corresponded to the AM frequency with the highest energy. These findings demonstrate that pitch perception relies heavily on amplitude dynamics determined by cochlear tonotopic position, underscoring the critical interplay between envelopes and carriers in the processing of complex pitch.
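
To make the stimulus design concrete, the following is a minimal sketch of how a multi-tone with one amplitude envelope per carrier could be synthesized. It is an illustration under stated assumptions, not the procedure used in the study: the function name `am_multitone`, the sinusoidal envelope shape, the modulation depth, and all carrier and AM frequencies below are hypothetical values chosen only for the example.

```python
import numpy as np

def am_multitone(carriers_hz, am_rates_hz, duration_s=0.5, fs=44100, depth=1.0):
    """Sum of sinusoidal carriers, each multiplied by its own sinusoidal envelope.

    All parameters are illustrative assumptions, not values from the study.
    """
    if len(carriers_hz) != len(am_rates_hz):
        raise ValueError("need exactly one AM rate per carrier")
    t = np.arange(int(round(duration_s * fs))) / fs
    signal = np.zeros_like(t)
    for fc, fm in zip(carriers_hz, am_rates_hz):
        # Envelope oscillates between (1 - depth) and (1 + depth) at rate fm.
        envelope = 1.0 + depth * np.sin(2 * np.pi * fm * t)
        signal += envelope * np.sin(2 * np.pi * fc * t)
    # Scale by the number of components so the sum stays within [-2, 2].
    return t, signal / len(carriers_hz)

# Hypothetical carriers and AM rates (Hz).
carriers = [1000.0, 2000.0, 4000.0]
_, standard = am_multitone(carriers, [100.0, 100.0, 100.0])
# Comparison differing only in the envelope on the lowest-frequency carrier.
_, low_diff = am_multitone(carriers, [120.0, 100.0, 100.0])
# Comparison differing only in the envelope on the highest-frequency carrier.
_, high_diff = am_multitone(carriers, [100.0, 100.0, 120.0])
```

In this sketch, `standard` versus `low_diff` corresponds to the condition in which the differing envelope modulates the lowest-frequency carrier, and `standard` versus `high_diff` to the condition in which it modulates a higher-frequency carrier.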