Multi-channel Audio Processing
| Content Provider | The Lens |
|---|---|
| Abstract | A method including: receiving at least a first input audio channel and a second input audio channel; and using an inter-channel prediction model to form at least an inter-channel direction of reception parameter. |
| Related Links | https://www.lens.org/lens/patent/012-657-966-114-359/frontpage |
| Language | English |
| Publisher Date | 2017-02-28 |
| Access Restriction | Open |
| Content Type | Text |
| Resource Type | Patent |
| Jurisdiction | United States of America |
| Date Applied | 2009-12-16 |
| Agent | Harrington & Smith |
| Applicant | Ojala Pasi; Nokia Technologies Oy |
| Application No. | 200913516362 |
| Claim | A method comprising: receiving a first input audio channel and a second input audio channel that jointly represent a spatial audio image; determining a first metric as a prediction gain of an inter-channel prediction model that predicts the first input audio channel based at least in part on the second audio input channel, wherein the prediction model is one of an autoregressive model, a moving average model, and an autoregressive moving average model and a second metric as a prediction gain of an inter-channel prediction model that predicts the second input audio channel based at least in part on the first audio input channel, wherein the prediction model is one of an autoregressive model, a moving average model, and an autoregressive moving average model, wherein determining the first metric comprises computing the respective prediction gain as the ratio between energy of the predicted first input audio channel and the energy of a prediction error signal determined as the difference between the first input audio channel and the predicted first input audio channel, and wherein determining the second metric comprises computing the respective prediction gain as the ratio between energy of the predicted second input audio channel and the energy of a prediction error signal determined as the difference between the second input audio channel and the predicted second input audio channel; computing a comparison value that compares the first metric and the second metric; and computing at least one inter-channel direction of reception parameter based on the comparison value. A method as claimed in claim 1 , further comprising providing an output signal comprising a downmixed signal and the at least one inter-channel direction of reception parameter. 
A method as claimed in claim 1 , further comprising: using the first metric as an operand of a slowly varying function to obtain a modified first metric; using the second metric as an operand of the same slowly varying function to obtain a modified second metric; determining as the comparison value, a difference between the modified first metric and the modified second metric. A method as claimed in claim 3 , wherein the comparison value is a difference between a logarithm of the first metric and the logarithm of the second metric. A method as claimed in claim 1 , further comprising: mapping the inter-channel direction of reception parameter to the comparison value using a mapping function calibrated from the obtained comparison value and an associated inter-channel direction of reception parameter. A method as claimed in claim 5 , wherein the associated inter-channel direction of reception parameter is determined using at least one of an absolute inter-channel time difference parameter and an absolute inter-channel level difference parameter. A method as claimed in claim 5 , further comprising recalibrating the mapping function intermittently. A method as claimed in claim 5 , wherein the mapping function is a function of time and sub band and is determined using available obtained comparison values and associated inter-channel direction of reception parameters. A method as claimed in claim 1 , wherein the inter-channel prediction model represents a predicted sample of an audio channel in terms of a different audio channel. A method as claimed in claim 9 , further comprising minimizing a cost function for the predicted sample to determine an inter-channel prediction model and using the determined inter-channel prediction model to determine at least one inter-channel parameter. 
A method as claimed in claim 1 , further comprising segmenting at least the first input audio channel and second input audio channel in the time slots in the time domain and sub bands in the frequency domain and using an inter-channel prediction model to form an inter-channel direction of reception parameter for each of a plurality of sub bands. A method as claimed in claim 1 further comprising using at least one selection criterion for selecting an inter-channel prediction model for use, wherein the at least one selection criterion is based upon a performance measure of the inter-channel prediction model. A method as claimed in claim 12 , wherein the performance measure is prediction gain. A method as claimed in claim 1 comprising selecting an inter-channel prediction model for use from a plurality of inter-channel prediction models. A non-transitory computer readable medium storing a program of instructions, execution of which by at least one processor configures an apparatus to perform the method of claim 1 . 
A non-transitory computer readable medium storing a program of instructions, execution of which by at least one processor configures an apparatus to at least: receive a first input audio channel and a second input audio channel that jointly represent a spatial audio image; determine a first metric as a prediction gain of an inter-channel prediction model that predicts the first input audio channel based at least in part on the second audio input channel, wherein the prediction model is one of an autoregressive model, a moving average model, and an autoregressive moving average model, and a second metric as a prediction gain of an inter-channel prediction model that predicts the second input audio channel based at least in part on the first audio input channel, wherein the prediction model is one of an autoregressive model, a moving average model, and an autoregressive moving average model, wherein determining the first metric comprises computing the respective prediction gain as the ratio between energy of the predicted first input audio channel and the energy of a prediction error signal determined as the difference between the first input audio channel and the predicted first input audio channel, and wherein determining the second metric comprises computing the respective prediction gain as the ratio between energy of the predicted second input audio channel and the energy of a prediction error signal determined as the difference between the second input audio channel and the predicted second input audio channel; compute a comparison value that compares the first metric and the second metric; and compute at least one inter-channel direction of reception parameter based on the comparison value. 
A non-transitory computer readable medium as claimed in claim 16 , wherein the apparatus is further configured to: use the first metric as an operand of a slowly varying function to obtain a modified first metric; use the second metric as an operand of the same slowly varying function to obtain a modified second metric; and determine as the comparison value, a difference between the modified first metric and the modified second metric. A non-transitory computer readable medium as claimed in claim 16 , wherein the comparison value is a difference between a logarithm of the first metric and the logarithm of the second metric. An apparatus comprising: at least one processor; memory storing a program of instructions; wherein the memory storing the program of instructions is configured to, with the at least one processor, cause the apparatus to at least: receive a first input audio channel and a second input audio channel that jointly represent a spatial audio image; determine a first metric as a prediction gain of an inter-channel prediction model that predicts the first input audio channel based at least in part on the second audio input channel, wherein the prediction model is one of an autoregressive model, a moving average model, and an autoregressive moving average model, and a second metric as a prediction gain of an inter-channel prediction model that predicts the second input audio channel based at least in part on the first audio input channel, wherein the prediction model is one of an autoregressive model, a moving average model, and an autoregressive moving average model, wherein determining the first metric comprises computing the respective prediction gain as the ratio between energy of the predicted first input audio channel and the energy of a prediction error signal determined as the difference between the first input audio channel and the predicted first input audio channel, and wherein determining the second metric comprises computing the respective 
prediction gain as the ratio between energy of the predicted second input audio channel and the energy of a prediction error signal determined as the difference between the second input audio channel and the predicted second input audio channel; compute a comparison value that compares the first metric and the second metric; and compute at least one inter-channel direction of reception parameter. An apparatus as claimed in claim 19 , wherein the apparatus is further caused to: use the first metric as an operand of a slowly varying function to obtain a modified first metric; use the second metric as an operand of the same slowly varying function to obtain a modified second metric; and use as the comparison value, a difference between the modified first metric and the modified second metric. A method comprising: receiving at least one inter-channel direction of reception parameter, wherein the at least one inter-channel direction of reception parameter is computed based on a comparison value, wherein the comparison value is computed as a comparison of a first metric and a second metric that jointly represent a spatial audio image, wherein the first metric is determined as prediction gain of an inter-channel prediction model that predicts a first audio input channel based at least on a second audio input channel, wherein the prediction model is one of an autoregressive model, a moving average model, and an autoregressive moving average model, and the second metric is determined as a prediction gain of an inter-channel prediction model that predicts a second input audio channel based at least on a first audio input channel, wherein the prediction model is one of an autoregressive model, a moving average model, and an autoregressive moving average model, wherein determining the first metric comprises computing the respective prediction gain as the ratio between energy of the predicted first input audio channel and the energy of a prediction error signal determined as 
the difference between the first input audio channel and the predicted first input audio channel, and wherein determining the second metric comprises computing the respective prediction gain as the ratio between energy of the predicted second input audio channel and the energy of a prediction error signal determined as the difference between the second input audio channel and the predicted second input audio channel; and using a downmixed signal and the at least one inter-channel direction of reception parameter to render multi-channel audio output. A method as claimed in claim 21 further comprising: converting the at least one inter-channel direction of reception parameter to an inter-channel time difference before rendering the multi-channel audio output. A method as claimed in claim 21 further comprising: converting the at least one inter-channel direction of reception parameter to level values using a panning law. |
| CPC Classification | SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING; STEREOPHONIC SYSTEMS; BROADCAST COMMUNICATION |
| Examiner | Duc Nguyen; Assad Mohammed |
| Extended Family | 012-657-966-114-359 074-308-223-768-603 026-206-048-699-453 135-298-450-020-448 135-017-798-236-422 017-989-884-956-686 095-290-834-438-91X 116-923-539-861-552 031-174-930-174-184 027-103-129-649-92X 120-490-109-283-178 |
| Patent ID | 9584235 |
| Inventor/Author | Ojala Pasi |
| IPC | H04H20/47 G10L19/008 G10L21/02 G10L21/0216 H04H40/36 H04S3/00 |
| Status | Inactive |
| Owner | Nokia Corporation; Nokia Technologies Oy |
| Simple Family | 012-657-966-114-359 074-308-223-768-603 026-206-048-699-453 135-298-450-020-448 135-017-798-236-422 031-174-930-174-184 116-923-539-861-552 027-103-129-649-92X 095-290-834-438-91X 017-989-884-956-686 120-490-109-283-178 |
| CPC (with Group) | G10L19/008 G10L25/12 G10L2021/02166 H04S3/008 H04S2420/03 H04S3/00 G10L19/00 H04H40/36 |
| Issuing Authority | United States Patent and Trademark Office (USPTO) |
| Kind | Patent/New European patent specification (amended specification after opposition procedure) |
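
The claims describe two prediction gains, one for each prediction direction between the channels, and a comparison value formed as the difference of their logarithms. The following is a minimal sketch of that computation, not the patented implementation. It assumes a causal moving-average (FIR) predictor fitted by ordinary least squares, with no time-slot or sub-band segmentation. The function names, the model order of 4, the small floor used to avoid division by zero, and the synthetic test signal are all hypothetical choices made for illustration.

```python
import math
import random


def solve(a, y):
    """Solve the linear system a x = y by Gaussian elimination with partial pivoting."""
    n = len(y)
    m = [row[:] + [y[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def predict(target, source, order):
    """Fit target[n] ~ sum_k b[k] * source[n - k] (assumed MA model) and return the prediction."""
    n = len(target)

    def lagged(k, i):
        # Samples before the start of the signal are treated as zero.
        return source[i - k] if i - k >= 0 else 0.0

    # Normal equations R b = p of the least-squares cost function.
    R = [[sum(lagged(j, i) * lagged(k, i) for i in range(n)) for k in range(order)]
         for j in range(order)]
    p = [sum(lagged(j, i) * target[i] for i in range(n)) for j in range(order)]
    b = solve(R, p)
    return [sum(b[k] * lagged(k, i) for k in range(order)) for i in range(n)]


def prediction_gain(target, source, order=4):
    """Energy of the predicted channel divided by energy of the prediction error."""
    predicted = predict(target, source, order)
    e_pred = sum(v * v for v in predicted)
    e_err = sum((t - v) ** 2 for t, v in zip(target, predicted))
    return e_pred / max(e_err, 1e-12)  # floor is an assumption, not from the claims


def comparison_value(ch1, ch2, order=4):
    """Difference between the logarithms of the two prediction gains."""
    g1 = max(prediction_gain(ch1, ch2, order), 1e-12)  # channel 1 predicted from channel 2
    g2 = max(prediction_gain(ch2, ch1, order), 1e-12)  # channel 2 predicted from channel 1
    return math.log10(g1) - math.log10(g2)


def demo_channels(n=400, delay=2, seed=0):
    """Hypothetical test pair: channel 1 is a delayed, noisy copy of channel 2's source."""
    rng = random.Random(seed)
    s = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ch2 = [v + rng.gauss(0.0, 0.05) for v in s]
    ch1 = [(s[i - delay] if i >= delay else 0.0) + rng.gauss(0.0, 0.05) for i in range(n)]
    return ch1, ch2
```

With the synthetic pair from `demo_channels()`, channel 1 lags channel 2, so a causal predictor can explain channel 1 from channel 2 but not the reverse, and the comparison value comes out positive. The sign only says which channel is better predicted from the other. Turning the value into an inter-channel direction of reception parameter would still need the calibrated mapping function that the claims mention, which this sketch does not attempt.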