[openal] [PATCH] Add AVX linear resampler

Mariusz 'MX' Szaflik mxadd at o2.pl
Sun Jun 8 11:55:59 EDT 2014


On 2014-06-08 07:01, Timothy Arceri wrote:
> On Sat, 2014-06-07 at 21:36 -0700, Chris Robinson wrote:
>> On 06/07/2014 02:59 AM, Timothy Arceri wrote:
>>> +    float const fracOne = 1.0f/FRACTIONONE;
>>> +    ...
>>> +    const __m256 fracOne8 = _mm256_broadcast_ss(&fracOne);
>> I'm a bit confused by this. According to Intel's docs for
>> _mm256_broadcast_ss and _mm_broadcast_ss:
>>
>> <https://software.intel.com/sites/products/documentation/studio/composer/en-us/2011Update/compiler_c/intref_cls/common/intref_avx_broadcast_ss.htm>
>>
>> the given parameter is a "pointer to a memory location that can hold
>> constant 256-bit or 128-bit float32 values". In that case, isn't a raw
>> float too small? It may also not be aligned correctly. Though if it's
>> only taking a single float value, I don't see why it would have to be
>> 256/128 bits large. Maybe it's an error in their docs.
>>
> I think it must be an error. This example from the Intel website just
> uses a plain float:
>
> https://software.intel.com/en-us/articles/introduction-to-intel-advanced-vector-extensions
>
>> Perhaps it would be better to do:
>>
>> const __m256 fracOne8 = _mm256_set1_ps(1.0f/FRACTIONONE);
>>
>> GCC should actually turn that call into a compile-time vector constant
>> and simply move it into a register as needed (it does for _mm_set1_ps
>> and _mm_set1_epi32 when given a compile-time constant).
> Yeah, that's fine. I just used _mm256_broadcast_ss as I read that it's
> faster, but it was never going to matter much anyway as it's not inside
> the critical loop.

_mm256_set1_ps is a 'compound' intrinsic: it is not mapped directly onto
any single AVX instruction. It is implemented like this (pseudo code):
__m256 _mm256_set1_ps(float v)
{ return _mm256_broadcast_ss(&v); }

It is actually better (for cache line pollution) to use
_mm256_broadcast_ss 'in place', because the memory read is 8 times
smaller (4 bytes instead of a 32-byte vector constant set up with
_mm256_set1_ps), so there is less pollution of the cache.
(But the real-life difference will probably be minimal, if any.)
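
For illustration, the two variants discussed in this thread could be
sketched as below. This is only a sketch under assumptions: FRACTIONONE
is given a placeholder value (1 << 14), the helper names
frac_one_broadcast and frac_one_set1 and the AVX_FN macro are made up
for this example, and the target attribute is GCC/Clang specific (with
other compilers, build with AVX enabled instead). Note that
_mm256_broadcast_ss reads a single 4-byte float and has no alignment
requirement, so passing the address of a plain float is fine.

```c
#include <immintrin.h>

/* Assumed value, for illustration only; the real definition lives in
 * the OpenAL Soft headers. */
#define FRACTIONONE (1 << 14)

/* GCC/Clang: enable AVX for these functions without needing -mavx for
 * the whole translation unit. Hypothetical helper macro. */
#if defined(__GNUC__)
#define AVX_FN __attribute__((target("avx")))
#else
#define AVX_FN
#endif

/* Variant 1: broadcast from a 4-byte float in memory. */
AVX_FN static void frac_one_broadcast(float out[8])
{
    const float fracOne = 1.0f / FRACTIONONE;
    const __m256 v = _mm256_broadcast_ss(&fracOne);
    _mm256_storeu_ps(out, v); /* unaligned store, out needs no alignment */
}

/* Variant 2: let the compiler decide how to materialize the constant. */
AVX_FN static void frac_one_set1(float out[8])
{
    const __m256 v = _mm256_set1_ps(1.0f / FRACTIONONE);
    _mm256_storeu_ps(out, v);
}
```

Both helpers fill out[] with eight copies of 1.0f/FRACTIONONE. Which
instructions the compiler actually emits for the set1 variant depends
on the compiler and the optimization level.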



