A Method and An Apparatus and A Computer Program Product for Video Encoding and Decoding
| Content Provider | The Lens |
|---|---|
| Related Links | https://www.lens.org/lens/patent/011-578-819-367-848/frontpage |
| Language | English |
| Publisher Date | 2019-08-21 |
| Access Restriction | Open |
| Alternative Title | Methode, Vorrichtung Und Computerprogramm Für Videokodierung Und Dekodierung; Procédé, Appareil Et Programme D'ordinateur Pour Le Codage Et Le Décodage Vidéo |
| Content Type | Text |
| Resource Type | Patent |
| Date Applied | 2017-11-28 |
| Agent | Nokia Epo Representatives |
| Applicant | Nokia Technologies Oy |
| Application No. | 17203955 |
| Claim | 1. A method of reducing data rate for transmitting virtual reality content, comprising: obtaining a picture sequence; receiving viewport information, wherein the viewport information comprises first viewport parameters of a prevailing viewport and/or second viewport parameters of one or more expected viewports; and selecting a first spatial region (912) based on the received viewport information and a second spatial region (913) within a picture area of pictures of the picture sequence, the second spatial region (913) differing from the first spatial region (912); obtaining a first spatial region sequence, the first spatial region sequence comprising the first spatial region (912) of the pictures of the picture sequence; obtaining a second spatial region sequence, the second spatial region sequence comprising the second spatial region (913) of the pictures of the picture sequence; forming a first sub-sequence of the first spatial region sequence at a second picture rate, wherein the pictures of the first sub-sequence are temporally aligned with the pictures of the second spatial region sequence; forming a second sub-sequence of the first spatial region sequence comprising all pictures not in the first sub-sequence; transmitting the first spatial region sequence at a first picture rate; transmitting the second spatial region sequence at the second picture rate, the first picture rate being different from the second picture rate. 2. A method according to claim 1, wherein the first picture rate is greater than the second picture rate, wherein the method further comprises: transmitting the first sub-sequence over a first transmission channel (1101); transmitting the second sub-sequence over a second transmission channel (1102), the second transmission channel (1102) differing from the first transmission channel (1101); transmitting the second spatial region sequence over the first transmission channel (1101) or a third transmission channel (1203), the third transmission channel (1203) differing from the second transmission channel (1102). 3. A method according to claim 1, wherein the picture sequence is obtained by video encoding, and wherein the video encoding comprises: encoding a first bitstream comprising the first spatial region sequence at the first picture rate; and encoding a second bitstream comprising the second spatial region sequence at the second picture rate, the second bitstream being decodable independently of the first bitstream. 4. A method according to claim 3, wherein the video encoding comprises: encoding the first spatial region (912) and the second spatial region (913) as a first single picture when they are temporally aligned; encoding the first spatial region (912) and blocks marked as non-coded as a second single picture when no second spatial region (913) is temporally aligned with the first spatial region (912). 5. A method according to claim 3, wherein the video encoding comprises: encoding the first spatial region sequence as a first scalable layer of a bitstream; encoding the second spatial region sequence as a second scalable layer of the bitstream. 6. A method according to claim 1 or any of claims 3 to 5, wherein the first picture rate is less than the second picture rate, the method further comprising: receiving gaze position information; selecting the second spatial region (913) as a fovea region based on the received gaze position information, the fovea region being a subset of the first spatial region (912); encoding the first spatial region sequence at a first sampling density, a first picture quality, a first bit-depth, a first dynamic range, and a first color gamut; encoding the second spatial region sequence at a second sampling density, a second picture quality, a second bit-depth, a second dynamic range, and a second color gamut, wherein at least one of the second sampling density, the second picture quality, the second bit-depth, the second dynamic range, and the second color gamut is greater than the first sampling density, the first picture quality, the first bit-depth, the first dynamic range, and the first color gamut. 7. A method according to claim 1, further comprising: receiving the first spatial region sequence at the first picture rate; receiving a received second spatial region sequence at the first picture rate; selecting a temporal subset at the second picture rate of the received second spatial region sequence; and transmitting the temporal subset as the second spatial region sequence at the second picture rate. 8. A method of receiving virtual reality content at reduced data rate, comprising: transmitting viewport information, wherein the viewport information comprises one or both of the following: first viewport parameters of a prevailing viewport; second viewport parameters of one or more expected viewports; decoding a first spatial region sequence at a first picture rate, wherein a first spatial region (912) is based on the transmitted viewport information; decoding a second spatial region sequence at a second picture rate, wherein the second spatial region (913) is different from the first spatial region (912) and the first picture rate is greater than the second picture rate, wherein the first spatial region sequence comprises a first sub-sequence at the second picture rate and a second sub-sequence comprising all pictures of the first spatial region sequence not in the first sub-sequence, and wherein pictures of the first sub-sequence are temporally aligned with pictures of the second spatial region sequence; obtaining viewport parameters of a viewport; in response to the first spatial region (912) covering the viewport, displaying at least a first subset of the decoded first spatial region sequence; and in response to the first spatial region (912) not covering the viewport, forming a combination of the decoded first spatial region sequence and the second spatial region sequence, and displaying at least a second subset of said combination. 9. A method according to claim 8, wherein the forming of the combination comprises decreasing a picture rate of the first spatial region sequence to be the same as the second picture rate, or increasing a picture rate of the second spatial region sequence to be the same as the first picture rate. 10. A method according to claim 8 or 9, wherein the forming of the combination comprises: decreasing a picture rate of the first spatial region sequence to be a third picture rate; and increasing a picture rate of the second spatial region sequence to be the third picture rate. 11. A method according to any of the preceding claims 8 to 10, wherein the decoding comprises: decoding a first bitstream comprising the first spatial region sequence at the first picture rate; and decoding a second bitstream comprising the second spatial region sequence at the second picture rate. 12. A method according to any of the preceding claims 8 to 11, wherein the decoding comprises: decoding the first spatial region (912) and the second spatial region (913) as a first single picture when they are temporally aligned; decoding the first spatial region (912) and blocks marked as non-coded as a second single picture when no second spatial region (913) is temporally aligned with the first spatial region (912). 13. A method according to any of the preceding claims 8 to 12, wherein the decoding comprises: decoding the first spatial region sequence from a first scalable layer of a bitstream; decoding the second spatial region sequence from a second scalable layer of the bitstream. 14. An apparatus comprising at least one processor, memory including computer program code, the memory and the computer program code configured to, with the at least one processor, cause the apparatus to perform a method according to any of the preceding claims 1 to 13. |
| CPC Classification | PICTORIAL COMMUNICATION, e.g. TELEVISION; IMAGE DATA PROCESSING OR GENERATION, IN GENERAL |
| Extended Family | 031-130-558-214-244 014-036-903-726-244 045-110-724-440-221 144-948-326-167-862 011-578-819-367-848 162-015-698-669-001 |
| Patent ID | 3334164 |
| Inventor/Author | Hannuksela, Miska; Talmola, Pekka; Salmimaa, Marja |
| IPC | H04N21/4728 H04N13/111 H04N19/597 H04N21/2343 H04N21/63 H04N21/6587 H04N21/845 |
| Status | Active |
| Owner | Nokia Technologies Oy |
| Simple Family | 031-130-558-214-244 011-578-819-367-848 045-110-724-440-221 014-036-903-726-244 144-948-326-167-862 162-015-698-669-001 |
| CPC (with Group) | H04N21/4728 G06T2207/20104 H04N13/111 H04N13/207 H04N13/25 H04N13/344 H04N19/597 H04N21/23439 H04N21/631 H04N21/6587 H04N21/8456 H04N19/115 H04N19/162 H04N19/167 H04N19/17 H04N19/187 |
| Issuing Authority | United States Patent and Trademark Office (USPTO) |
| Kind | Patent/Patent 1st level of publication/Inventor's certificate |
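
The temporal partitioning and rate matching described in the Claim field can be illustrated with a short sketch. This is not the patented implementation; it is a minimal Python illustration under stated assumptions: the first picture rate is an integer multiple of the second, regions and viewports are plain `(x, y, width, height)` rectangles with no spherical wrap-around, and rate increase is done by simple picture repetition. All function names and the 60/30 picture rates are hypothetical.

```python
def split_first_region(num_pictures, first_rate, second_rate):
    """Split picture indices of the first spatial region sequence.

    Returns (first_sub_sequence, second_sub_sequence): the indices that are
    temporally aligned with the second spatial region sequence, and all
    remaining indices. Assumes an integer ratio between the two rates.
    """
    if first_rate % second_rate != 0:
        raise ValueError("this sketch assumes an integer rate ratio")
    step = first_rate // second_rate
    aligned = [i for i in range(num_pictures) if i % step == 0]
    remaining = [i for i in range(num_pictures) if i % step != 0]
    return aligned, remaining


def covers(region, viewport):
    """True if the rectangular region fully contains the viewport.

    Both arguments are (x, y, width, height) tuples in picture coordinates
    (an assumption; a real system would work on the sphere).
    """
    rx, ry, rw, rh = region
    vx, vy, vw, vh = viewport
    return (rx <= vx and ry <= vy
            and vx + vw <= rx + rw and vy + vh <= ry + rh)


def match_rates(first_seq, second_seq, step, mode="increase_second"):
    """Pair pictures of both region sequences at a common picture rate.

    mode="decrease_first": keep only the aligned pictures of the first region.
    mode="increase_second": repeat each second-region picture `step` times.
    """
    if mode == "decrease_first":
        return list(zip(first_seq[::step], second_seq))
    repeated = [pic for pic in second_seq for _ in range(step)]
    return list(zip(first_seq, repeated))


if __name__ == "__main__":
    # Hypothetical rates: first region at 60 pictures/s, second at 30.
    aligned, remaining = split_first_region(8, 60, 30)
    print("first sub-sequence :", aligned)
    print("second sub-sequence:", remaining)

    region, viewport = (0, 0, 100, 100), (80, 10, 50, 50)
    if covers(region, viewport):
        print("display the first spatial region only")
    else:
        print("combine both regions:",
              match_rates(list("abcd"), [0, 1], step=2))
```

In this sketch the aligned indices could travel on one channel and the remaining indices on another, as in claim 2, while the receiver only combines the two regions when the viewport leaves the first spatial region, as in claims 8 and 9.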