/*
 * Copyright (c) 2015 Manojkumar Bhosale ([email protected])
 *
 * This file is part of FFmpeg.
 *
 * FFmpeg is free software; you can redistribute it and/or
 * modify it under the terms of the GNU Lesser General Public
 * License as published by the Free Software Foundation; either
 * version 2.1 of the License, or (at your option) any later version.
 *
 * FFmpeg is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
 * Lesser General Public License for more details.
 *
 * You should have received a copy of the GNU Lesser General Public
 * License along with FFmpeg; if not, write to the Free Software
 * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
 */
/* Description : Load 4 words with stride
   Arguments   : Inputs  - psrc    (source pointer to load from)
                         - stride
                 Outputs - out0, out1, out2, out3
   Details     : Loads word in 'out0' from (psrc)
                 Loads word in 'out1' from (psrc + stride)
                 Loads word in 'out2' from (psrc + 2 * stride)
                 Loads word in 'out3' from (psrc + 3 * stride)
*/
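/* Example     : A minimal scalar sketch of the strided word load described
                 above. The names 'load_word' and 'load_4_words' are
                 hypothetical and not part of FFmpeg; memcpy is used here
                 on the assumption that psrc may be unaligned.
*/
```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read one 32-bit word from a possibly unaligned address. */
static uint32_t load_word(const uint8_t *psrc)
{
    uint32_t val;
    memcpy(&val, psrc, sizeof(val));
    return val;
}

/* out[i] is loaded from (psrc + i * stride), for i = 0..3. */
static void load_4_words(const uint8_t *psrc, ptrdiff_t stride,
                         uint32_t out[4])
{
    for (int i = 0; i < 4; i++)
        out[i] = load_word(psrc + i * stride);
}
```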
/* Description : Load double words with stride
   Arguments   : Inputs  - psrc    (source pointer to load from)
                         - stride
                 Outputs - out0, out1
   Details     : Loads double word in 'out0' from (psrc)
                 Loads double word in 'out1' from (psrc + stride)
*/
/* Description : Store 4 words with stride
   Arguments   : Inputs  - in0, in1, in2, in3, pdst, stride
   Details     : Stores word from 'in0' to (pdst)
                 Stores word from 'in1' to (pdst + stride)
                 Stores word from 'in2' to (pdst + 2 * stride)
                 Stores word from 'in3' to (pdst + 3 * stride)
*/
/* Description : Store 4 double words with stride
   Arguments   : Inputs  - in0, in1, in2, in3, pdst, stride
   Details     : Stores double word from 'in0' to (pdst)
                 Stores double word from 'in1' to (pdst + stride)
                 Stores double word from 'in2' to (pdst + 2 * stride)
                 Stores double word from 'in3' to (pdst + 3 * stride)
*/
/* Description : Load vector elements with stride
   Arguments   : Inputs  - psrc    (source pointer to load from)
                         - stride
                 Outputs - out0, out1
                 Return Type - as per RTYPE
   Details     : Loads elements in 'out0' from (psrc)
                 Loads elements in 'out1' from (psrc + stride)
*/
/* Description : Store vectors with stride
   Arguments   : Inputs  - in0, in1, stride
                 Outputs - pdst    (destination pointer to store to)
   Details     : Stores elements from 'in0' to (pdst)
                 Stores elements from 'in1' to (pdst + stride)
*/
/* Description : Store half word elements of vector with stride
 * Arguments   : Inputs  - in      (source vector)
 *                       - pdst    (destination pointer to store to)
 *                       - stride
 * Details     : Stores half word 'idx0' from 'in' to (pdst)
 *               Stores half word 'idx1' from 'in' to (pdst + stride)
 *               Similar for other elements
 */
/* Description : Store word elements of vector with stride
 * Arguments   : Inputs  - in      (source vector)
 *                       - pdst    (destination pointer to store to)
 *                       - stride
 * Details     : Stores word 'idx0' from 'in' to (pdst)
 *               Stores word 'idx1' from 'in' to (pdst + stride)
 *               Similar for other elements
 */
/* Description : Store double word elements of vector with stride
 * Arguments   : Inputs  - in      (source vector)
 *                       - pdst    (destination pointer to store to)
 *                       - stride
 * Details     : Stores double word 'idx0' from 'in' to (pdst)
 *               Stores double word 'idx1' from 'in' to (pdst + stride)
 *               Similar for other elements
 */
/* Description : Store as 12x8 byte block to destination memory from
                 input vectors
   Arguments   : Inputs  - in0, in1, in2, in3, in4, in5, in6, in7, pdst, stride
   Details     : Index 0 double word element from input vector 'in0' is copied
                 and stored to destination memory at (pblk_12x8_m), followed by
                 index 2 word element from the same input vector 'in0' at
                 (pblk_12x8_m + 8)
                 Similar for the remaining lines
*/
/* Description : Average with rounding (in0 + in1 + 1) / 2.
   Arguments   : Inputs  - in0, in1, in2, in3
                 Outputs - out0, out1
                 Return Type - as per RTYPE
   Details     : Each byte element from 'in0' vector is added to the
                 corresponding byte element from 'in1' vector. The addition of
                 the elements plus 1 (for rounding) is done unsigned with full
                 precision, i.e. the result has one extra bit. Unsigned
                 division by 2 (or logical shift right by one bit) is performed
                 before writing the result to vector 'out0'
                 Similar for the pair of 'in2' and 'in3'
*/
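/* Example     : A minimal scalar sketch of the rounding average described
                 above, assuming 16 byte lanes per vector. The name 'aver_u8'
                 is hypothetical; this models the arithmetic only and does not
                 use MSA vector registers.
*/
```c
#include <stdint.h>

/* out[i] = (in0[i] + in1[i] + 1) >> 1, computed in a wider type so the
 * intermediate sum keeps its extra bit before the shift. */
static void aver_u8(const uint8_t in0[16], const uint8_t in1[16],
                    uint8_t out[16])
{
    for (int i = 0; i < 16; i++) {
        unsigned sum = (unsigned)in0[i] + in1[i] + 1u;
        out[i] = (uint8_t)(sum >> 1);
    }
}
```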
/* Description : Immediate number of columns to slide
   Arguments   : Inputs  - s, d, slide_val
                 Outputs - out
                 Return Type - as per RTYPE
   Details     : Byte elements from 'd' vector are slid into 's' by the
                 number of elements specified by 'slide_val'
*/
/* Description : Shuffle byte vector elements as per mask vector
   Arguments   : Inputs  - in0, in1, in2, in3, mask0, mask1
                 Outputs - out0, out1
                 Return Type - as per RTYPE
   Details     : Selective byte elements from in0 & in1 are copied to out0 as
                 per control vector mask0
                 Selective byte elements from in2 & in3 are copied to out1 as
                 per control vector mask1
*/
/* Description : Shuffle halfword vector elements as per mask vector
   Arguments   : Inputs  - in0, in1, in2, in3, mask0, mask1
                 Outputs - out0, out1
                 Return Type - as per RTYPE
   Details     : Selective halfword elements from in0 & in1 are copied to out0
                 as per control vector mask0
                 Selective halfword elements from in2 & in3 are copied to out1
                 as per control vector mask1
*/
/* Description : Shuffle byte vector elements as per mask vector
   Arguments   : Inputs  - in0, in1, in2, in3, mask0, mask1
                 Outputs - out0, out1
                 Return Type - as per RTYPE
   Details     : Selective byte elements from in0 & in1 are copied to out0 as
                 per control vector mask0
                 Selective byte elements from in2 & in3 are copied to out1 as
                 per control vector mask1
*/
/* Description : Dot product of byte vector elements
   Arguments   : Inputs  - mult0, mult1
                           cnst0, cnst1
                 Outputs - out0, out1
                 Return Type - as per RTYPE
   Details     : Unsigned byte elements from mult0 are multiplied with
                 unsigned byte elements from cnst0 producing a result
                 twice the size of input i.e. unsigned halfword.
                 The multiplication results of adjacent odd-even elements
                 are then added together and stored to the out vector
                 (2 unsigned halfword results)
*/
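/* Example     : A minimal scalar sketch of the unsigned byte dot product
                 described above, assuming 16 byte lanes in and 8 halfword
                 lanes out. The name 'dotp_u8' is hypothetical. The sum of
                 two products can exceed 16 bits; this sketch assumes the
                 result wraps modulo 2^16.
*/
```c
#include <stdint.h>

/* out[i] = mult[2i] * cnst[2i] + mult[2i + 1] * cnst[2i + 1],
 * truncated to 16 bits. */
static void dotp_u8(const uint8_t mult[16], const uint8_t cnst[16],
                    uint16_t out[8])
{
    for (int i = 0; i < 8; i++) {
        uint32_t even = (uint32_t)mult[2 * i] * cnst[2 * i];
        uint32_t odd  = (uint32_t)mult[2 * i + 1] * cnst[2 * i + 1];
        out[i] = (uint16_t)(even + odd);
    }
}
```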
/* Description : Dot product of byte vector elements
   Arguments   : Inputs  - mult0, mult1
                           cnst0, cnst1
                 Outputs - out0, out1
                 Return Type - as per RTYPE
   Details     : Signed byte elements from mult0 are multiplied with
                 signed byte elements from cnst0 producing a result
                 twice the size of input i.e. signed halfword.
                 The multiplication results of adjacent odd-even elements
                 are then added together and stored to the out vector
                 (2 signed halfword results)
*/
/* Description : Dot product of halfword vector elements
   Arguments   : Inputs  - mult0, mult1
                           cnst0, cnst1
                 Outputs - out0, out1
                 Return Type - as per RTYPE
   Details     : Signed halfword elements from mult0 are multiplied with
                 signed halfword elements from cnst0 producing a result
                 twice the size of input i.e. signed word.
                 The multiplication results of adjacent odd-even elements
                 are then added together and stored to the out vector
                 (2 signed word results)
*/
/* Description : Dot product & addition of byte vector elements
   Arguments   : Inputs  - mult0, mult1
                           cnst0, cnst1
                 Outputs - out0, out1
                 Return Type - as per RTYPE
   Details     : Signed byte elements from mult0 are multiplied with
                 signed byte elements from cnst0 producing a result
                 twice the size of input i.e. signed halfword.
                 The multiplication results of adjacent odd-even elements
                 are then added to the out vector
                 (2 signed halfword results)
*/
/* Description : Dot product & addition of byte vector elements
   Arguments   : Inputs  - mult0, mult1
                           cnst0, cnst1
                 Outputs - out0, out1
                 Return Type - as per RTYPE
   Details     : Unsigned byte elements from mult0 are multiplied with
                 unsigned byte elements from cnst0 producing a result
                 twice the size of input i.e. unsigned halfword.
                 The multiplication results of adjacent odd-even elements
                 are then added to the out vector
                 (2 unsigned halfword results)
*/
/* Description : Dot product & addition of halfword vector elements
   Arguments   : Inputs  - mult0, mult1
                           cnst0, cnst1
                 Outputs - out0, out1
                 Return Type - as per RTYPE
   Details     : Signed halfword elements from mult0 are multiplied with
                 signed halfword elements from cnst0 producing a result
                 twice the size of input i.e. signed word.
                 The multiplication results of adjacent odd-even elements
                 are then added to the out vector
                 (2 signed word results)
*/
/* Description : Minimum values between unsigned elements of
                 either vector are copied to the output vector
   Arguments   : Inputs  - in0, in1, min_vec
                 Outputs - in0, in1 (in place)
                 Return Type - as per RTYPE
   Details     : Minimum of unsigned halfword element values from 'in0' and
                 'min_vec' is written to output vector 'in0'
*/
/* Description : Clips all halfword elements of input vector between min & max
                 out = ((in) < (min)) ? (min) : (((in) > (max)) ? (max) : (in))
   Arguments   : Inputs  - in      (input vector)
                         - min     (min threshold)
                         - max     (max threshold)
                 Outputs - in      (output vector with clipped elements)
                 Return Type - signed halfword
*/
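/* Example     : A minimal scalar sketch of the clip formula above, applied
                 in place to 8 signed halfword lanes. The name 'clip_i16' is
                 hypothetical and the lane count is an assumption based on a
                 128-bit vector.
*/
```c
#include <stdint.h>

/* in[i] = (in[i] < min) ? min : ((in[i] > max) ? max : in[i]) */
static void clip_i16(int16_t in[8], int16_t min, int16_t max)
{
    for (int i = 0; i < 8; i++) {
        if (in[i] < min)
            in[i] = min;
        else if (in[i] > max)
            in[i] = max;
    }
}
```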
/* Description : Clips all signed halfword elements of input vector
                 between 0 & 255
   Arguments   : Inputs  - in      (input vector)
                 Outputs - in      (output vector with clipped elements)
                 Return Type - signed halfword
*/
/* Description : Clips all signed word elements of input vector
                 between 0 & 255
   Arguments   : Inputs  - in      (input vector)
                 Outputs - in      (output vector with clipped elements)
                 Return Type - signed word
*/
/* Description : Addition of 4 signed word elements
                 4 signed word elements of input vector are added together and
                 the resulting integer sum is returned
   Arguments   : Inputs  - in      (signed word vector)
                 Outputs - sum_m   (i32 sum)
                 Return Type - signed word
*/
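/* Example     : A minimal scalar sketch of the horizontal sum described
                 above. The name 'hadd_4_words' is hypothetical. The sketch
                 accumulates in 64 bits and assumes the final sum fits in a
                 signed 32-bit word.
*/
```c
#include <stdint.h>

/* Returns in[0] + in[1] + in[2] + in[3]. */
static int32_t hadd_4_words(const int32_t in[4])
{
    int64_t sum = 0;

    for (int i = 0; i < 4; i++)
        sum += in[i];
    return (int32_t)sum; /* assumption: no overflow past 32 bits */
}
```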
/* Description : Addition of 8 unsigned halfword elements
                 8 unsigned halfword elements of input vector are added
                 together and the resulting integer sum is returned
   Arguments   : Inputs  - in      (unsigned halfword vector)
                 Outputs - sum_m   (u32 sum)
                 Return Type - unsigned word
*/
/* Description : Horizontal addition of signed byte vector elements
   Arguments   : Inputs  - in0, in1
                 Outputs - out0, out1
                 Return Type - as per RTYPE
   Details     : Each signed odd byte element from 'in0' is added to
                 even signed byte element from 'in0' (pairwise) and the
                 halfword result is stored in 'out0'
*/
/* Description : Horizontal addition of unsigned byte vector elements
   Arguments   : Inputs  - in0, in1
                 Outputs - out0, out1
                 Return Type - as per RTYPE
   Details     : Each unsigned odd byte element from 'in0' is added to
                 even unsigned byte element from 'in0' (pairwise) and the
                 halfword result is stored in 'out0'
*/
/* Description : Horizontal subtraction of unsigned byte vector elements
   Arguments   : Inputs  - in0, in1
                 Outputs - out0, out1
                 Return Type - as per RTYPE
   Details     : Each unsigned odd byte element from 'in0' is subtracted from
                 even unsigned byte element from 'in0' (pairwise) and the
                 halfword result is stored in 'out0'
*/
/* Description : SAD (Sum of Absolute Difference)
   Arguments   : Inputs  - in0, in1, ref0, ref1  (unsigned byte src & ref)
                 Outputs - sad_m                 (halfword vector with sad)
                 Return Type - unsigned halfword
   Details     : Absolute difference of all the byte elements from 'in0' with
                 'ref0' is calculated and preserved in 'diff0'. From the 16
                 unsigned absolute diff values, even-odd pairs are added
                 together to generate 8 halfword results.
*/
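/* Example     : A minimal scalar sketch of the per-pair SAD step described
                 above, for a single source/reference vector pair. The name
                 'sad_pairs_u8' is hypothetical; the description lists two
                 input pairs, and how their results are combined is not
                 modelled here.
*/
```c
#include <stdint.h>

/* Absolute byte differences are formed first, then each even-odd pair
 * of differences is summed into one halfword. */
static void sad_pairs_u8(const uint8_t in[16], const uint8_t ref[16],
                         uint16_t sad[8])
{
    for (int i = 0; i < 8; i++) {
        uint8_t a0 = in[2 * i],     r0 = ref[2 * i];
        uint8_t a1 = in[2 * i + 1], r1 = ref[2 * i + 1];
        uint16_t d0 = (uint16_t)(a0 > r0 ? a0 - r0 : r0 - a0);
        uint16_t d1 = (uint16_t)(a1 > r1 ? a1 - r1 : r1 - a1);

        sad[i] = (uint16_t)(d0 + d1);
    }
}
```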
/* Description : Insert specified word elements from input vectors to 1
                 destination vector
   Arguments   : Inputs  - in0, in1, in2, in3 (4 input vectors)
                 Outputs - out                (output vector)
                 Return Type - as per RTYPE
*/
/* Description : Insert specified double word elements from input vectors to 1
                 destination vector
   Arguments   : Inputs  - in0, in1 (2 input vectors)
                 Outputs - out      (output vector)
                 Return Type - as per RTYPE
*/
/* Description : Interleave even byte elements from vectors
   Arguments   : Inputs  - in0, in1, in2, in3
                 Outputs - out0, out1
                 Return Type - as per RTYPE
   Details     : Even byte elements of 'in0' and even byte
                 elements of 'in1' are interleaved and copied to 'out0'
                 Even byte elements of 'in2' and even byte
                 elements of 'in3' are interleaved and copied to 'out1'
*/
/* Description : Interleave even halfword elements from vectors
   Arguments   : Inputs  - in0, in1, in2, in3
                 Outputs - out0, out1
                 Return Type - as per RTYPE
   Details     : Even halfword elements of 'in0' and even halfword
                 elements of 'in1' are interleaved and copied to 'out0'
                 Even halfword elements of 'in2' and even halfword
                 elements of 'in3' are interleaved and copied to 'out1'
*/
/* Description : Interleave even word elements from vectors
   Arguments   : Inputs  - in0, in1, in2, in3
                 Outputs - out0, out1
                 Return Type - as per RTYPE
   Details     : Even word elements of 'in0' and even word
                 elements of 'in1' are interleaved and copied to 'out0'
                 Even word elements of 'in2' and even word
                 elements of 'in3' are interleaved and copied to 'out1'
*/
/* Description : Interleave even double word elements from vectors
   Arguments   : Inputs  - in0, in1, in2, in3
                 Outputs - out0, out1
                 Return Type - as per RTYPE
   Details     : Even double word elements of 'in0' and even double word
                 elements of 'in1' are interleaved and copied to 'out0'
                 Even double word elements of 'in2' and even double word
                 elements of 'in3' are interleaved and copied to 'out1'
*/
/* Description : Interleave left half of byte elements from vectors
   Arguments   : Inputs  - in0, in1, in2, in3
                 Outputs - out0, out1
                 Return Type - as per RTYPE
   Details     : Left half of byte elements of in0 and left half of byte
                 elements of in1 are interleaved and copied to out0.
                 Left half of byte elements of in2 and left half of byte
                 elements of in3 are interleaved and copied to out1.
*/
/* Description : Interleave left half of halfword elements from vectors
   Arguments   : Inputs  - in0, in1, in2, in3
                 Outputs - out0, out1
                 Return Type - as per RTYPE
   Details     : Left half of halfword elements of in0 and left half of halfword
                 elements of in1 are interleaved and copied to out0.
                 Left half of halfword elements of in2 and left half of halfword
                 elements of in3 are interleaved and copied to out1.
*/
/* Description : Interleave left half of word elements from vectors
   Arguments   : Inputs  - in0, in1, in2, in3
                 Outputs - out0, out1
                 Return Type - as per RTYPE
   Details     : Left half of word elements of in0 and left half of word
                 elements of in1 are interleaved and copied to out0.
                 Left half of word elements of in2 and left half of word
                 elements of in3 are interleaved and copied to out1.
*/
/* Description : Interleave right half of byte elements from vectors
   Arguments   : Inputs  - in0, in1, in2, in3, in4, in5, in6, in7
                 Outputs - out0, out1, out2, out3
                 Return Type - as per RTYPE
   Details     : Right half of byte elements of in0 and right half of byte
                 elements of in1 are interleaved and copied to out0.
                 Right half of byte elements of in2 and right half of byte
                 elements of in3 are interleaved and copied to out1.
                 Similar for other pairs
*/
/* Description : Interleave right half of halfword elements from vectors
   Arguments   : Inputs  - in0, in1, in2, in3, in4, in5, in6, in7
                 Outputs - out0, out1, out2, out3
                 Return Type - as per RTYPE
   Details     : Right half of halfword elements of in0 and right half of
                 halfword elements of in1 are interleaved and copied to out0.
                 Right half of halfword elements of in2 and right half of
                 halfword elements of in3 are interleaved and copied to out1.
                 Similar for other pairs
*/
/* Description : Interleave right half of double word elements from vectors
   Arguments   : Inputs  - in0, in1, in2, in3, in4, in5, in6, in7
                 Outputs - out0, out1, out2, out3
                 Return Type - as per RTYPE
   Details     : Right half of double word elements of in0 and right half of
                 double word elements of in1 are interleaved and copied to out0.
                 Right half of double word elements of in2 and right half of
                 double word elements of in3 are interleaved and copied to out1.
*/
/* Description : Interleave left half of double word elements from vectors
   Arguments   : Inputs  - in0, in1, in2, in3
                 Outputs - out0, out1
                 Return Type - as per RTYPE
   Details     : Left half of double word elements of in0 and left half of
                 double word elements of in1 are interleaved and copied to out0.
                 Left half of double word elements of in2 and left half of
                 double word elements of in3 are interleaved and copied to out1.
*/
/* Description : Interleave both left and right half of input vectors
   Arguments   : Inputs  - in0, in1
                 Outputs - out0, out1
                 Return Type - as per RTYPE
   Details     : Right half of byte elements from 'in0' and 'in1' are
                 interleaved and stored to 'out0'
                 Left half of byte elements from 'in0' and 'in1' are
                 interleaved and stored to 'out1'
*/
/* Description : Maximum values between signed elements of vector and
                 5-bit signed immediate value are copied to the output vector
   Arguments   : Inputs  - in0, in1, in2, in3, max_val
                 Outputs - in0, in1, in2, in3 (in place)
                 Return Type - as per RTYPE
   Details     : Maximum of signed halfword element values from 'in0' and
                 'max_val' are written to output vector 'in0'
*/
/* Description : Saturate the halfword element values to the max
                 unsigned value of (sat_val+1 bits)
                 The element data width remains unchanged
   Arguments   : Inputs  - in0, in1, in2, in3, sat_val
                 Outputs - in0, in1, in2, in3 (in place)
                 Return Type - as per RTYPE
   Details     : Each unsigned halfword element from 'in0' is saturated to the
                 value generated with (sat_val+1) bit range
                 Results are written in place to the original vectors
*/
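/* Example     : A minimal scalar sketch of the unsigned saturation described
                 above, for 8 halfword lanes. The name 'sat_u16' is
                 hypothetical, and 'sat_val' is assumed to lie in 0..15.
*/
```c
#include <stdint.h>

/* Each lane is limited to the largest value representable in
 * (sat_val + 1) bits, i.e. 2^(sat_val + 1) - 1. */
static void sat_u16(uint16_t in[8], unsigned sat_val)
{
    uint16_t max = (uint16_t)((1u << (sat_val + 1)) - 1u);

    for (int i = 0; i < 8; i++) {
        if (in[i] > max)
            in[i] = max;
    }
}
```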
/* Description : Saturate the halfword element values to the max
                 unsigned value of (sat_val+1 bits)
                 The element data width remains unchanged
   Arguments   : Inputs  - in0, in1, in2, in3, sat_val
                 Outputs - in0, in1, in2, in3 (in place)
                 Return Type - as per RTYPE
   Details     : Each unsigned halfword element from 'in0' is saturated to the
                 value generated with (sat_val+1) bit range
                 Results are written in place to the original vectors
*/
/* Description : Saturate the word element values to the max
                 unsigned value of (sat_val+1 bits)
                 The element data width remains unchanged
   Arguments   : Inputs  - in0, in1, in2, in3, sat_val
                 Outputs - in0, in1, in2, in3 (in place)
                 Return Type - as per RTYPE
   Details     : Each unsigned word element from 'in0' is saturated to the
                 value generated with (sat_val+1) bit range
                 Results are written in place to the original vectors
*/
/* Description : Indexed halfword element values are replicated to all
                 elements in output vector
   Arguments   : Inputs  - in, idx0, idx1
                 Outputs - out0, out1
                 Return Type - as per RTYPE
   Details     : 'idx0' element value from 'in' vector is replicated to all
                 elements in 'out0' vector
                 Valid index range for halfword operation is 0-7
*/
/* Description : Indexed word element values are replicated to all
                 elements in output vector
   Arguments   : Inputs  - in, stidx
                 Outputs - out0, out1
                 Return Type - as per RTYPE
   Details     : 'stidx' element value from 'in' vector is replicated to all
                 elements in 'out0' vector
                 'stidx + 1' element value from 'in' vector is replicated to all
                 elements in 'out1' vector
                 Valid index range for word operation is 0-3
*/
/* Description : Pack even byte elements of vector pairs
   Arguments   : Inputs  - in0, in1, in2, in3
                 Outputs - out0, out1
                 Return Type - as per RTYPE
   Details     : Even byte elements of in0 are copied to the left half of
                 out0 & even byte elements of in1 are copied to the right
                 half of out0.
                 Even byte elements of in2 are copied to the left half of
                 out1 & even byte elements of in3 are copied to the right
                 half of out1.
*/
/* Description : Pack even halfword elements of vector pairs
   Arguments   : Inputs  - in0, in1, in2, in3
                 Outputs - out0, out1
                 Return Type - as per RTYPE
   Details     : Even halfword elements of in0 are copied to the left half of
                 out0 & even halfword elements of in1 are copied to the right
                 half of out0.
                 Even halfword elements of in2 are copied to the left half of
                 out1 & even halfword elements of in3 are copied to the right
                 half of out1.
*/
/* Description : Pack even double word elements of vector pairs
   Arguments   : Inputs  - in0, in1, in2, in3
                 Outputs - out0, out1
                 Return Type - as per RTYPE
   Details     : Even double word elements of in0 are copied to the left half
                 of out0 & even double word elements of in1 are copied to the
                 right half of out0.
                 Even double word elements of in2 are copied to the left half
                 of out1 & even double word elements of in3 are copied to the
                 right half of out1.
*/
/* Description : Pack odd double word elements of vector pairs
   Arguments   : Inputs  - in0, in1
                 Outputs - out0, out1
                 Return Type - as per RTYPE
   Details     : As operation is on same input 'in0' vector, index 1 double word
                 element is overwritten to index 0 and result is written to out0
                 As operation is on same input 'in1' vector, index 1 double word
                 element is overwritten to index 0 and result is written to out1
*/
/* Description : Each byte element is logically xor'ed with immediate 128
   Arguments   : Inputs  - in0, in1
                 Outputs - in0, in1 (in-place)
                 Return Type - as per RTYPE
   Details     : Each unsigned byte element from input vector 'in0' is
                 logically xor'ed with 128 and result is in-place stored in
                 'in0' vector
                 Each unsigned byte element from input vector 'in1' is
                 logically xor'ed with 128 and result is in-place stored in
                 'in1' vector
                 Similar for other pairs
*/
/* Description : Addition of signed halfword elements and signed saturation
   Arguments   : Inputs  - in0, in1, in2, in3
                 Outputs - out0, out1
                 Return Type - as per RTYPE
   Details     : Signed halfword elements from 'in0' are added to signed
                 halfword elements of 'in1'. The result is then signed saturated
                 between -32768 to +32767 (as per halfword data type)
                 Similar for other pairs
*/
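/* Example     : A minimal scalar sketch of the saturating signed halfword
                 addition described above, shown for a single lane. The name
                 'adds_s16' is hypothetical.
*/
```c
#include <stdint.h>

/* The sum is formed in 32 bits, then clamped to -32768..+32767. */
static int16_t adds_s16(int16_t a, int16_t b)
{
    int32_t sum = (int32_t)a + (int32_t)b;

    if (sum > INT16_MAX)
        return INT16_MAX;
    if (sum < INT16_MIN)
        return INT16_MIN;
    return (int16_t)sum;
}
```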
/* Description : Shift left all elements of vector (generic for all data types)
   Arguments   : Inputs  - in0, in1, in2, in3, shift
                 Outputs - in0, in1, in2, in3 (in place)
                 Return Type - as per input vector RTYPE
   Details     : Each element of vector 'in0' is left shifted by 'shift' and
                 result is in place written to 'in0'
                 Similar for other pairs
*/
/* Description : Arithmetic shift right all elements of vector
                 (generic for all data types)
   Arguments   : Inputs  - in0, in1, in2, in3, shift
                 Outputs - in0, in1, in2, in3 (in place)
                 Return Type - as per input vector RTYPE
   Details     : Each element of vector 'in0' is right shifted by 'shift' and
                 result is in place written to 'in0'
                 Here, 'shift' is GP variable passed in
                 Similar for other pairs
*/
/* Description : Shift right logical all halfword elements of vector
   Arguments   : Inputs  - in0, in1, in2, in3, shift
                 Outputs - in0, in1, in2, in3 (in place)
                 Return Type - as per RTYPE
   Details     : Each element of vector 'in0' is shifted right logical by
                 number of bits respective element holds in vector 'shift' and
                 result is in place written to 'in0'
                 Here, 'shift' is a vector passed in
                 Similar for other pairs
*/
/* Description : Shift right arithmetic rounded halfwords
   Arguments   : Inputs  - in0, in1, shift
                 Outputs - in0, in1 (in place)
                 Return Type - as per RTYPE
   Details     : Each element of vector 'in0' is shifted right arithmetic by
                 number of bits respective element holds in vector 'shift'.
                 The last discarded bit is added to shifted value for rounding
                 and the result is in place written to 'in0'
                 Here, 'shift' is a vector passed in
                 Similar for other pairs
*/
/* Description : Shift right arithmetic rounded words
   Arguments   : Inputs  - in0, in1, shift
                 Outputs - in0, in1 (in place)
                 Return Type - as per RTYPE
   Details     : Each element of vector 'in0' is shifted right arithmetic by
                 number of bits respective element holds in vector 'shift'.
                 The last discarded bit is added to shifted value for rounding
                 and the result is in place written to 'in0'
                 Here, 'shift' is a vector passed in
                 Similar for other pairs
*/
/* Description : Shift right arithmetic rounded (immediate)
   Arguments   : Inputs  - in0, in1, in2, in3, shift
                 Outputs - in0, in1, in2, in3 (in place)
                 Return Type - as per RTYPE
   Details     : Each element of vector 'in0' is shifted right arithmetic by
                 value in 'shift'.
                 The last discarded bit is added to shifted value for rounding
                 and the result is in place written to 'in0'
                 Similar for other pairs
*/
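/* Example     : A minimal scalar sketch of the rounded arithmetic right
                 shift described above, for one signed halfword lane. The
                 names 'asr32' and 'srari_i16' are hypothetical. 'shift' is
                 assumed to lie in 0..15, and a shift of 0 is assumed to
                 return the input unchanged.
*/
```c
#include <stdint.h>

/* Portable arithmetic shift right: C leaves >> on negative values
 * implementation-defined, so negative inputs go through ~x instead. */
static int32_t asr32(int32_t x, unsigned n)
{
    return x < 0 ? ~(~x >> n) : x >> n;
}

/* Shift right by 'shift', then add the last discarded bit. */
static int16_t srari_i16(int16_t x, unsigned shift)
{
    int32_t v = x;

    if (shift == 0)
        return x;
    return (int16_t)(asr32(v, shift) + (asr32(v, shift - 1) & 1));
}
```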
/* Description : Shift right arithmetic rounded (immediate)
   Arguments   : Inputs  - in0, in1, shift
                 Outputs - in0, in1 (in place)
                 Return Type - as per RTYPE
   Details     : Each element of vector 'in0' is shifted right arithmetic by
                 value in 'shift'.
                 The last discarded bit is added to shifted value for rounding
                 and the result is in place written to 'in0'
                 Similar for other pairs
*/
/* Description : Multiplication of pairs of vectors
   Arguments   : Inputs  - in0, in1, in2, in3
                 Outputs - out0, out1
   Details     : Each element from 'in0' is multiplied with the corresponding
                 element from 'in1' and result is written to 'out0'
                 Similar for other pairs
*/
/* Description : Addition of 2 pairs of vectors
   Arguments   : Inputs  - in0, in1, in2, in3
                 Outputs - out0, out1
   Details     : Elements of each of the 2 vector pairs are added, producing
                 2 results
*/
/* Description : Subtraction of 2 pairs of vectors
   Arguments   : Inputs  - in0, in1, in2, in3
                 Outputs - out0, out1
   Details     : Elements of each of the 2 vector pairs are subtracted,
                 producing 2 results
*/
/* Description : Sign extend byte elements from right half of the vector
   Arguments   : Input  - in       (byte vector)
                 Output - out      (sign extended halfword vector)
                 Return Type - signed halfword
   Details     : Sign bit of byte elements from input vector 'in' is
                 extracted and interleaved with same vector 'in' to generate
                 8 halfword elements keeping sign intact
*/
/* Description : Sign extend halfword elements from right half of the vector
   Arguments   : Inputs  - in      (input halfword vector)
                 Outputs - out     (sign extended word vectors)
                 Return Type - signed word
   Details     : Sign bit of halfword elements from input vector 'in' is
                 extracted and interleaved with same vector 'in' to generate
                 4 word elements keeping sign intact
*/
/* Description : Sign extend byte elements from input vector and return
                 halfword results in pair of vectors
   Arguments   : Inputs  - in           (1 input byte vector)
                 Outputs - out0, out1   (sign extended 2 halfword vectors)
                 Return Type - signed halfword
   Details     : Sign bit of byte elements from input vector 'in' is
                 extracted and interleaved right with same vector 'in' to
                 generate 8 signed halfword elements in 'out0'
                 Then interleaved left with same vector 'in' to
                 generate 8 signed halfword elements in 'out1'
*/
/* Description : Zero extend unsigned byte elements to halfword elements
   Arguments   : Inputs  - in           (1 input unsigned byte vector)
                 Outputs - out0, out1   (unsigned 2 halfword vectors)
                 Return Type - signed halfword
   Details     : Zero extended right half of vector is returned in 'out0'
                 Zero extended left half of vector is returned in 'out1'
*/
/* Description : Sign extend halfword elements from input vector and return
                 result in pair of vectors
   Arguments   : Inputs  - in           (1 input halfword vector)
                 Outputs - out0, out1   (sign extended 2 word vectors)
                 Return Type - signed word
   Details     : Sign bit of halfword elements from input vector 'in' is
                 extracted and interleaved right with same vector 'in' to
                 generate 4 signed word elements in 'out0'
                 Then interleaved left with same vector 'in' to
                 generate 4 signed word elements in 'out1'
*/
/* Description : Swap two variables
   Arguments   : Inputs  - in0, in1
                 Outputs - in0, in1 (in-place)
   Details     : Swapping of two input variables using xor
*/
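/* Example     : A minimal scalar sketch of an xor-based swap as described
                 above. The name 'swap_xor' is hypothetical. The pointer
                 check is an addition of this sketch: xor-swapping an object
                 with itself would otherwise zero it.
*/
```c
#include <stdint.h>

/* Swaps *a and *b without a temporary, using three xor steps. */
static void swap_xor(uint32_t *a, uint32_t *b)
{
    if (a == b)
        return;
    *a ^= *b;
    *b ^= *a;
    *a ^= *b;
}
```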
/* Description : Butterfly of 4 input vectors
   Arguments   : Inputs  - in0, in1, in2, in3
                 Outputs - out0, out1, out2, out3
   Details     : Butterfly operation
*/
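/* Example     : A minimal scalar sketch of a 4-input butterfly. The comment
                 above does not spell out the arithmetic, so the pairing used
                 here (outer pair and inner pair, sums first and differences
                 second) is an assumption, and the name 'butterfly_4' is
                 hypothetical.
*/
```c
#include <stdint.h>

/* Assumed pairing:
 *   out[0] = in[0] + in[3]     out[1] = in[1] + in[2]
 *   out[2] = in[1] - in[2]     out[3] = in[0] - in[3]  */
static void butterfly_4(const int32_t in[4], int32_t out[4])
{
    out[0] = in[0] + in[3];
    out[1] = in[1] + in[2];
    out[2] = in[1] - in[2];
    out[3] = in[0] - in[3];
}
```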
/* Description : Butterfly of 8 input vectors
   Arguments   : Inputs  - in0 ... in7
                 Outputs - out0 .. out7
   Details     : Butterfly operation
*/
/* Description : Butterfly of 16 input vectors
   Arguments   : Inputs  - in0 ... in15
                 Outputs - out0 .. out15
   Details     : Butterfly operation
*/
/* Description : Transposes input 4x4 byte block
   Arguments   : Inputs  - in0, in1, in2, in3      (input 4x4 byte block)
                 Outputs - out0, out1, out2, out3  (output 4x4 byte block)
                 Return Type - unsigned byte
   Details     :
*/
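/* Example     : A minimal scalar sketch of a 4x4 byte transpose, assuming
                 each input vector contributes one 4-byte row. The name
                 'transpose_4x4_u8' is hypothetical; the sketch models the
                 result layout only, not the vector shuffles used to get it.
*/
```c
#include <stdint.h>

/* out[i][j] = in[j][i]: row j of the input becomes column j of the
 * output. */
static void transpose_4x4_u8(const uint8_t in[4][4], uint8_t out[4][4])
{
    for (int i = 0; i < 4; i++) {
        for (int j = 0; j < 4; j++)
            out[i][j] = in[j][i];
    }
}
```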
/* Description : Transposes input 8x4 byte block into 4x8
   Arguments   : Inputs  - in0, in1, in2, in3      (input 8x4 byte block)
                 Outputs - out0, out1, out2, out3  (output 4x8 byte block)
                 Return Type - as per RTYPE
   Details     :
*/
/* Description : Transposes input 8x8 byte block
   Arguments   : Inputs  - in0, in1, in2, in3, in4, in5, in6, in7
                           (input 8x8 byte block)
                 Outputs - out0, out1, out2, out3, out4, out5, out6, out7
                           (output 8x8 byte block)
                 Return Type - as per RTYPE
   Details     :
*/
/* Description : Transposes 16x4 block into 4x16 with byte elements in vectors
   Arguments   : Inputs  - in0, in1, in2, in3, in4, in5, in6, in7,
                           in8, in9, in10, in11, in12, in13, in14, in15
                 Outputs - out0, out1, out2, out3
                 Return Type - unsigned byte
   Details     :
*/
/* Description : Transposes 16x8 block into 8x16 with byte elements in vectors
   Arguments   : Inputs  - in0, in1, in2, in3, in4, in5, in6, in7,
                           in8, in9, in10, in11, in12, in13, in14, in15
                 Outputs - out0, out1, out2, out3, out4, out5, out6, out7
                 Return Type - unsigned byte
   Details     :
*/
/* Description : Transposes 4x4 block with half word elements in vectors
   Arguments   : Inputs  - in0, in1, in2, in3
                 Outputs - out0, out1, out2, out3
                 Return Type - signed halfword
   Details     :
*/
/* Description : Transposes 8x8 block with half word elements in vectors
   Arguments   : Inputs  - in0, in1, in2, in3, in4, in5, in6, in7
                 Outputs - out0, out1, out2, out3, out4, out5, out6, out7
                 Return Type - as per RTYPE
   Details     :
*/
/* Description : Transposes 4x4 block with word elements in vectors
   Arguments   : Inputs  - in0, in1, in2, in3
                 Outputs - out0, out1, out2, out3
                 Return Type - signed word
   Details     :
*/
/* Description : Average byte elements from pair of vectors and store 8x4 byte
                 block in destination memory
   Arguments   : Inputs - in0, in1, in2, in3, in4, in5, in6, in7, pdst, stride
   Details     : Each byte element from input vector pair 'in0' and 'in1' is
                 averaged (a + b)/2 and stored in 'tmp0_m'
                 Each byte element from input vector pair 'in2' and 'in3' is
                 averaged (a + b)/2 and stored in 'tmp1_m'
                 Each byte element from input vector pair 'in4' and 'in5' is
                 averaged (a + b)/2 and stored in 'tmp2_m'
                 Each byte element from input vector pair 'in6' and 'in7' is
                 averaged (a + b)/2 and stored in 'tmp3_m'
                 The half vector results from all 4 vectors are stored in
                 destination memory as 8x4 byte block
*/
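/* Illustration : A scalar sketch of the truncating average and the 8x4
                  store described above, for reference only. The names
                  'ave_u8_ref' and 'ave_st8x4_ref' are hypothetical, each
                  vector is modelled as 16 bytes in memory, and "half
                  vector" is assumed to mean the first 8 bytes in memory
                  order.
*/

```c
#include <stddef.h>
#include <stdint.h>

/* Truncating average of two bytes: (a + b) / 2, computed without overflow. */
uint8_t ave_u8_ref(uint8_t a, uint8_t b)
{
    return (uint8_t)(((unsigned)a + b) >> 1);
}

/* in[0..7] point at eight 16-byte vectors; (in[2i], in[2i + 1]) is pair i.
 * Assumption: only the first 8 averaged bytes of each pair are written,
 * one pair per destination row. */
void ave_st8x4_ref(const uint8_t *const in[8], uint8_t *pdst, ptrdiff_t stride)
{
    for (int row = 0; row < 4; row++) {
        const uint8_t *a = in[2 * row];
        const uint8_t *b = in[2 * row + 1];
        for (int col = 0; col < 8; col++)
            pdst[row * stride + col] = ave_u8_ref(a[col], b[col]);
    }
}
```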
/* Description : Average byte elements from pair of vectors and store 16x4 byte
                 block in destination memory
   Arguments   : Inputs - in0, in1, in2, in3, in4, in5, in6, in7, pdst, stride
   Details     : Each byte element from input vector pair 'in0' and 'in1' is
                 averaged (a + b)/2 and stored in 'tmp0_m'
                 Each byte element from input vector pair 'in2' and 'in3' is
                 averaged (a + b)/2 and stored in 'tmp1_m'
                 Each byte element from input vector pair 'in4' and 'in5' is
                 averaged (a + b)/2 and stored in 'tmp2_m'
                 Each byte element from input vector pair 'in6' and 'in7' is
                 averaged (a + b)/2 and stored in 'tmp3_m'
                 The results from all 4 vectors are stored in destination
                 memory as 16x4 byte block
*/
/* Description : Average rounded byte elements from pair of vectors and store
                 8x4 byte block in destination memory
   Arguments   : Inputs - in0, in1, in2, in3, in4, in5, in6, in7, pdst, stride
   Details     : Each byte element from input vector pair 'in0' and 'in1' is
                 average rounded (a + b + 1)/2 and stored in 'tmp0_m'
                 Each byte element from input vector pair 'in2' and 'in3' is
                 average rounded (a + b + 1)/2 and stored in 'tmp1_m'
                 Each byte element from input vector pair 'in4' and 'in5' is
                 average rounded (a + b + 1)/2 and stored in 'tmp2_m'
                 Each byte element from input vector pair 'in6' and 'in7' is
                 average rounded (a + b + 1)/2 and stored in 'tmp3_m'
                 The half vector results from all 4 vectors are stored in
                 destination memory as 8x4 byte block
*/
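/* Illustration : A scalar sketch of the rounded average (a + b + 1)/2,
                  to contrast it with the truncating (a + b)/2 form. The
                  names 'aver_u8_ref' and 'aver_bytes_ref' are hypothetical
                  and exist only in this example.
*/

```c
#include <stdint.h>

/* Rounded average of two bytes: (a + b + 1) / 2, so ties round up. */
uint8_t aver_u8_ref(uint8_t a, uint8_t b)
{
    return (uint8_t)(((unsigned)a + b + 1) >> 1);
}

/* Apply the rounded average independently to each of n byte lanes. */
void aver_bytes_ref(const uint8_t *a, const uint8_t *b, uint8_t *out, int n)
{
    for (int i = 0; i < n; i++)
        out[i] = aver_u8_ref(a[i], b[i]);
}
```

/* For inputs 1 and 2 the truncating form gives 1, while the rounded form
   gives 2; the two forms agree whenever a + b is even.
*/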
/* Description : Average rounded byte elements from pair of vectors and store
                 16x4 byte block in destination memory
   Arguments   : Inputs - in0, in1, in2, in3, in4, in5, in6, in7, pdst, stride
   Details     : Each byte element from input vector pair 'in0' and 'in1' is
                 average rounded (a + b + 1)/2 and stored in 'tmp0_m'
                 Each byte element from input vector pair 'in2' and 'in3' is
                 average rounded (a + b + 1)/2 and stored in 'tmp1_m'
                 Each byte element from input vector pair 'in4' and 'in5' is
                 average rounded (a + b + 1)/2 and stored in 'tmp2_m'
                 Each byte element from input vector pair 'in6' and 'in7' is
                 average rounded (a + b + 1)/2 and stored in 'tmp3_m'
                 The vector results from all 4 vectors are stored in
                 destination memory as 16x4 byte block
*/
/* Description : Average rounded byte elements from pair of vectors,
                 average rounded with destination and store 8x4 byte block
                 in destination memory
   Arguments   : Inputs - in0, in1, in2, in3, in4, in5, in6, in7, pdst, stride
   Details     : Each byte element from input vector pair 'in0' and 'in1' is
                 average rounded (a + b + 1)/2 and stored in 'tmp0_m'
                 Each byte element from input vector pair 'in2' and 'in3' is
                 average rounded (a + b + 1)/2 and stored in 'tmp1_m'
                 Each byte element from input vector pair 'in4' and 'in5' is
                 average rounded (a + b + 1)/2 and stored in 'tmp2_m'
                 Each byte element from input vector pair 'in6' and 'in7' is
                 average rounded (a + b + 1)/2 and stored in 'tmp3_m'
                 Each of these results is then average rounded with the
                 bytes already present in destination memory
                 The half vector results from all 4 vectors are stored in
                 destination memory as 8x4 byte block
*/
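/* Illustration : A scalar sketch of the two-stage rounded average for one
                  8-byte destination row: first the input pair, then the
                  result against the existing destination bytes. The names
                  'aver_rnd_u8_ref' and 'aver_dst_row8_ref' are hypothetical,
                  and the order of the two stages is taken from the
                  description above, not from the macro body.
*/

```c
#include <stdint.h>

/* Rounded average of two bytes: (a + b + 1) / 2. */
uint8_t aver_rnd_u8_ref(uint8_t a, uint8_t b)
{
    return (uint8_t)(((unsigned)a + b + 1) >> 1);
}

/* One 8-byte row: round-average a and b, then round-average that
 * intermediate value with the byte already stored in dst. */
void aver_dst_row8_ref(const uint8_t *a, const uint8_t *b, uint8_t *dst)
{
    for (int col = 0; col < 8; col++) {
        uint8_t tmp = aver_rnd_u8_ref(a[col], b[col]);
        dst[col] = aver_rnd_u8_ref(dst[col], tmp);
    }
}
```

/* Because each stage rounds separately, the result can differ by one from
   a single-step (a + b + 2 * dst + 2)/4.
*/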
/* Description : Average rounded byte elements from pair of vectors,
                 average rounded with destination and store 16x4 byte block
                 in destination memory
   Arguments   : Inputs - in0, in1, in2, in3, in4, in5, in6, in7, pdst, stride
   Details     : Each byte element from input vector pair 'in0' and 'in1' is
                 average rounded (a + b + 1)/2 and stored in 'tmp0_m'
                 Each byte element from input vector pair 'in2' and 'in3' is
                 average rounded (a + b + 1)/2 and stored in 'tmp1_m'
                 Each byte element from input vector pair 'in4' and 'in5' is
                 average rounded (a + b + 1)/2 and stored in 'tmp2_m'
                 Each byte element from input vector pair 'in6' and 'in7' is
                 average rounded (a + b + 1)/2 and stored in 'tmp3_m'
                 Each of these results is then average rounded with the
                 bytes already present in destination memory
                 The vector results from all 4 vectors are stored in
                 destination memory as 16x4 byte block
*/
/* Description : Add block 4x4
   Arguments   : Inputs - in0, in1, in2, in3, pdst, stride
   Details     : Least significant 4 bytes from each input vector are added to
                 the destination bytes, clipped between 0-255 and then stored.
*/
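/* Illustration : A scalar sketch of "add to destination, clip to 0-255,
                  store" for a 4x4 block. The names 'clip_u8_ref' and
                  'addblk_st4x4_ref' are hypothetical. As an assumption for
                  this example only, the added values are modelled as 16
                  signed int16_t residuals in row-major order, which is what
                  makes clipping at both ends necessary.
*/

```c
#include <stddef.h>
#include <stdint.h>

/* Clamp an integer to the unsigned byte range 0..255. */
uint8_t clip_u8_ref(int v)
{
    return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* res holds 16 values, 4 per row (assumed layout). Each value is added to
 * the matching destination byte and the sum is clipped before the store. */
void addblk_st4x4_ref(const int16_t *res, uint8_t *pdst, ptrdiff_t stride)
{
    for (int row = 0; row < 4; row++) {
        for (int col = 0; col < 4; col++) {
            uint8_t *p = pdst + row * stride + col;
            *p = clip_u8_ref(*p + res[4 * row + col]);
        }
    }
}
```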
/* Description : Dot product and addition of 3 signed halfword input vectors
   Arguments   : Inputs  - in0, in1, in2, coeff0, coeff1, coeff2
                 Outputs - out0_m
                 Return Type - signed halfword
   Details     : Dot product of 'in0' with 'coeff0'
                 Dot product of 'in1' with 'coeff1'
                 Dot product of 'in2' with 'coeff2'
                 Addition of all the 3 vector results
                 out0_m = (in0 * coeff0) + (in1 * coeff1) + (in2 * coeff2)
*/
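/* Illustration : A scalar sketch of summing three dot products into signed
                  halfword lanes. The names 'dotp_pair_ref' and
                  'dpadd_sh3_ref' are hypothetical. The sketch assumes the
                  MSA-style dot product in which each halfword output lane
                  is the sum of two adjacent signed byte products; whether
                  the macro is built exactly this way is not stated in the
                  comment above.
*/

```c
#include <stdint.h>

/* One output lane: a[0] * c[0] + a[1] * c[1] on signed bytes. */
int dotp_pair_ref(const int8_t *a, const int8_t *c)
{
    return a[0] * c[0] + a[1] * c[1];
}

/* Each input and coefficient vector is 16 signed bytes; out is 8 halfwords.
 * The sum is formed in int and narrowed at the end, so large inputs can
 * exceed the int16_t range (conversion is then implementation-defined). */
void dpadd_sh3_ref(const int8_t *in0, const int8_t *in1, const int8_t *in2,
                   const int8_t *coeff0, const int8_t *coeff1,
                   const int8_t *coeff2, int16_t *out)
{
    for (int i = 0; i < 8; i++) {
        int acc = dotp_pair_ref(in0 + 2 * i, coeff0 + 2 * i)
                + dotp_pair_ref(in1 + 2 * i, coeff1 + 2 * i)
                + dotp_pair_ref(in2 + 2 * i, coeff2 + 2 * i);
        out[i] = (int16_t)acc;
    }
}
```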
/* Description : Pack even elements of input vectors & xor with 128
   Arguments   : Inputs  - in0, in1
                 Outputs - out_m
                 Return Type - unsigned byte
   Details     : Signed byte even elements from 'in0' and 'in1' are packed
                 together in one vector and the resulting vector is xor'ed
                 with 128 to shift the range from signed to unsigned byte
*/
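/* Illustration : A scalar sketch of packing even-indexed bytes and flipping
                  the sign bit. The name 'pckev_xori128_ref' is hypothetical,
                  and placing the bytes from 'in0' in the first half of the
                  output is an assumption about lane order made for this
                  example.
*/

```c
#include <stdint.h>

/* in0 and in1 are 16 signed bytes each; out receives 16 unsigned bytes.
 * XOR with 128 maps -128..127 onto 0..255 while preserving order. */
void pckev_xori128_ref(const int8_t *in0, const int8_t *in1, uint8_t *out)
{
    for (int i = 0; i < 8; i++) {
        out[i]     = (uint8_t)((uint8_t)in0[2 * i] ^ 0x80);
        out[i + 8] = (uint8_t)((uint8_t)in1[2 * i] ^ 0x80);
    }
}
```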
/* Description : Converts inputs to unsigned bytes, interleave, average & store
                 as 8x4 unsigned byte block
   Arguments   : Inputs - in0, in1, in2, in3, dst0, dst1, pdst, stride
*/
/* Description : Pack even byte elements, extract 0 & 2 index words from pair
                 of results and store 4 words in destination memory as per
                 stride
   Arguments   : Inputs - in0, in1, in2, in3, pdst, stride
*/
/* Description : Pack even byte elements and store byte vector in destination
                 memory
   Arguments   : Inputs - in0, in1, pdst
*/
/* Description : Horizontal 2 tap filter kernel code
   Arguments   : Inputs - in0, in1, mask, coeff, shift
*/