Skip to content

Conversation

@weslleyspereira
Copy link
Collaborator

@weslleyspereira weslleyspereira commented May 25, 2021

Description
The default behavior of (s,d)COMBSSQ is to always update the sum V1 even if V2 is null. The problem is that it also changes the scaling factor of V1. Note that a null sum is a representable floating-point number no matter its scaling factor. For instance, if V1 = (smallScale, smallNum) and V2 = (bigNum, 0), SCOMBSSQ( V1, V2 ) may result in V1 = (bigNum, 0).

This PR:

  1. adds
*     A zero sum V2 shall not modify the scaling factor of V1
      IF( V2( 2 ).EQ.ZERO ) RETURN
*

to (s,d)COMBSSQ to prevent the scaling factor of V1 from being updated when the second sum is zero.

  1. improve the comments in the xLASSQ routines.

This PR closes #565.

@codecov
Copy link

codecov bot commented May 25, 2021

Codecov Report

Merging #569 (c55a399) into master (6ace55e) will increase coverage by 0.00%.
The diff coverage is 100.00%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master     #569   +/-   ##
=======================================
  Coverage   82.37%   82.37%           
=======================================
  Files        1894     1894           
  Lines      190677   190679    +2     
=======================================
+ Hits       157063   157065    +2     
  Misses      33614    33614           
Impacted Files Coverage Δ
SRC/classq.f90 100.00% <ø> (ø)
SRC/dlassq.f90 100.00% <ø> (ø)
SRC/slassq.f90 100.00% <ø> (ø)
SRC/zlassq.f90 98.30% <ø> (ø)
SRC/dcombssq.f 88.88% <100.00%> (+1.38%) ⬆️
SRC/scombssq.f 88.88% <100.00%> (+1.38%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6ace55e...c55a399. Read the comment docs.

@weslleyspereira weslleyspereira merged commit bd6add2 into Reference-LAPACK:master May 25, 2021
@langou
Copy link
Contributor

langou commented May 26, 2021

Good catch.

To repeat.

For example if V1=( 1e-100, 1e+00 ) and V2=( 1e+100, 0e+00 ),
V1 represents x1 = V1(1)**2 * V1(2) = 1e-200
V2 represents x2 = V2(1)**2 * V2(2) = 0e+00
DCOMBSSQ computes a representation of x1+x2, so a representation of 1e-200. So for example V1 would be a good answer. This is what this fix does. If V2(2) = 0, then x2 = 0, so then do nothing and return V1. Sounds like a good idea.

Again, we want to do

x1 + x2 
= V1(1)**2 * V1(2) + V2(1)**2 * V2(2) 
= ( 1e-100 )**2 * ( 1e.00 ) + ( 1e+100 )**2 * ( 0e.00 ) 
= ( 1e-100 )**2 = 1e-200

the output can then be V1, so OUTPUT = ( 1e-100, 1e+00 ) which represents 1e-200.

then with

         V1( 2 ) = V2( 2 ) + ( V1( 1 ) / V2( 1 ) )**2 * V1( 2 )
         V1( 1 ) = V2( 1 )

we get

V1(2) = ( 0e+00 ) + ( 1e-100 / 1e+100 ) **2 * ( 1e+00 )
V1(2) = ( 0e+00 ) + ( 1e-200 ) **2 * ( 1e+00 )
V1(2) = ( 0e+00 ) + ( 1e-400 ) * ( 1e+00 )
V1(2) = ( 0e+00 ) + ( ⚠️ underflow ⚠️ ) * ( 1e+00 )
V1(2) = ( 0e+00 ) + ( 0.0e+00 ) * ( 1e+00 )
V1(2) = ( 0e+00 )
V1(1) = ( 1e+100 )

and so the output is OUTPUT = ( 1e+100, 0e+00 ) and that is not good.

Without overflow we would have gotten OUTPUT = ( 1e+100, 1e-400 ) which is a correct representation for 1e-200. But with overflow, well, we have problems.

Comment 1: Writing all this and I am not convinced that DCOMBSSQ does what it should. Maybe DCOMBSSQ assumes a certain range of input?

Comment 2: I see that in lots of codes (for example dlassq.f90)

   if( sumsq == zero ) scl = one

Which might have helped for the previous version of the code. It is weird that DLASSQ passes to DCOMBSSQ V2 as (bigNum, 0e+00). Is it really what is happening? Maybe, if that is the case, then we can "clean" DLASSQ so that if SUMSQ is ZERO, then SCL is ONE. That could be cleaner. It seems that the "older" DLASSQ was returning something clear, while the new DLASSQ is cleaning in input.

@langou
Copy link
Contributor

langou commented May 26, 2021

I have concerns with DCOMBSSQ and I do not understand the range of valid input.
I do not think V1=( 1e-100, 1e+50 ) and V2=( 1e+100, 1e-50 ) is valid input. (Or is it?)
But if it is, then

V1=( 1e-100, 1e+200 ) and V2=( 1e+100, 1e-200 ),
V1 represents x1 = V1(1)**2 * V1(2) = 1e+00
V2 represents x2 = V2(1)**2 * V2(2) = 1e+00
Then we
x1 + x2 = 2e+00

But

         V1( 2 ) = V2( 2 ) + ( V1( 1 ) / V2( 1 ) )**2 * V1( 2 )
         V1( 1 ) = V2( 1 )

we get

V1(2) = ( 1e-200 ) + ( 1e-100 / 1e+100 ) **2 * ( 1e+200 )
V1(2) = ( 1e-200 ) + ( 1e-200 ) **2 * ( 1e+200 )
V1(2) = ( 1e-200 ) + ( 1e-400 ) * ( 1e+200 )
V1(2) = ( 1e-200 ) + ( ⚠️ underflow ⚠️ ) * ( 1e+200 )
V1(2) = ( 1e-200 ) + ( 0.0e+00 ) * ( 1e+200 )
V1(2) = ( 1e-200 )
V1(1) = ( 1e+100 )

and so the output is OUTPUT = ( 1e+100, 1e-200 ) and that is not good, since this represents 1e+00, and we want to represent 2e+00.

Without overflow we would have gotten OUTPUT = ( 1e+100, 2e-200 ) which is a correct representation for 2e+00. But with overflow, well, we have problems.

As written above, I do not think V1=( 1e-100, 1e+200 ) and V2=( 1e+100, 1e-200 ) is a valid input data for DCOMBSSQ. But then we should probably describe and define the range of input for which this routine works.

@langou
Copy link
Contributor

langou commented May 26, 2021

Thanks to @mgates3. @mgates3 is telling me that DCOMBSSQ is coming from ScaLAPACK. This is good to know.

@langou
Copy link
Contributor

langou commented May 26, 2021

One more comment. It seems that the new DLASSQ from #494 actually might enable us to forego using DCOMBSSQ. Maybe we can go back to a LANGE() code that performs:

*
*        Find normF(A).
*
         SCALE = ZERO
         SUM = ONE
         DO 90 J = 1, N
            CALL DLASSQ( M, A( 1, J ), 1, SCALE, SUM )
   90    CONTINUE
         VALUE = SCALE*SQRT( SUM )

This is related to #290

@weslleyspereira
Copy link
Collaborator Author

Comment 2: I see that in lots of codes (for example dlassq.f90)

   if( sumsq == zero ) scl = one

Which might have helped for the previous version of the code. It is weird that DLASSQ passes to DCOMBSSQ V2 as (bigNum, 0e+00). Is it really what is happening? Maybe, if that is the case, then we can "clean" DLASSQ so that if SUMSQ is ZERO, then SCL is ONE. That could be cleaner. It seems that the "older" DLASSQ was returning something clear, while the new DLASSQ is cleaning in input.

Just one comment here. If X = { 0 }, both the older and the current version of xLASSQ return SCL = ONE and SUMSQ = ZERO. This is also not good for the current xLANGE implementations. One example in SLANGE:

V1 = { 2.64697796e-23, 1.97215226e-31 }
V2 = { 1.0, 0.0 }

results in

V1( 2 ) = V2( 2 ) + ( V1( 1 ) / V2( 1 ) )**2 * V1( 2 )
        =    0    + ( 2.64697796e-23 / 1.0 )**2 * 1.97215226e-31
        =    0    + UNDERFLOW
V2( 1 ) = V2( 1 ) = 1.0

This is exactly what happens with @christoph-conrads's example from #565 (comment). Whenever V1 is a sum of squares that underflows, a scale 1.0 in V2 figures as a big number.

@julielangou julielangou added this to the LAPACK 3.10.0 milestone Nov 12, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

xLANGE underflows for finite 1x2 matrix

3 participants