Skip to content

Conversation

@weslleyspereira
Copy link
Collaborator

@weslleyspereira weslleyspereira commented Feb 19, 2021

Edward Anderson stated in his paper https://doi.org/10.1145/3061665 the following:

The square root of a sum of squares is well known to be prone to overflow and underflow. Ad hoc scaling
of intermediate results, as has been done in numerical software such as the BLAS and LAPACK, mostly
avoids the problem, but it can still occur at extreme values in the range of representable numbers. More
careful scaling, as has been implemented in recent versions of the standard algorithms, may come at the
expense of performance or clarity. This work reimplements the vector 2-norm and the generation of Givens
rotations from the Level 1 BLAS to improve their performance and design. In addition, support for negative
increments is extended to the Level 1 BLAS operations on a single vector, and a comprehensive test suite
for all the Level 1 BLAS is included.

  • This PR replaces the original xLARTG with Edward's safe scaling xLARTG codes.

  • Since the codes were written in the Fortran90 standard, I removed

# Tell CMake that our Fortran sources are written in fixed format.
set(CMAKE_Fortran_FORMAT FIXED)

from the CMakeLists.txt. The compiler now determines the Fortran layout by the file extension.

@codecov
Copy link

codecov bot commented Feb 19, 2021

Codecov Report

Merging #493 (92cd6f1) into master (d71dd6e) will decrease coverage by 0.09%.
The diff coverage is 92.16%.

❗ Current head 92cd6f1 differs from pull request most recent head 415e140. Consider uploading reports for the commit 415e140 to get more accurate results
Impacted file tree graph

@@            Coverage Diff             @@
##           master     #493      +/-   ##
==========================================
- Coverage   83.41%   83.31%   -0.10%     
==========================================
  Files        1821     1820       -1     
  Lines      185725   170835   -14890     
==========================================
- Hits       154914   142334   -12580     
+ Misses      30811    28501    -2310     
Impacted Files Coverage Δ
SRC/zlartg.f90 87.50% <87.50%> (ø)
SRC/clartg.f90 89.28% <89.28%> (ø)
SRC/dlartg.f90 100.00% <100.00%> (ø)
SRC/slartg.f90 100.00% <100.00%> (ø)
SRC/dlaqr1.f 47.61% <0.00%> (-6.23%) ⬇️
SRC/slaqr1.f 47.61% <0.00%> (-6.23%) ⬇️
SRC/dlals0.f 74.73% <0.00%> (-4.76%) ⬇️
SRC/slals0.f 74.73% <0.00%> (-4.76%) ⬇️
SRC/chegst.f 86.44% <0.00%> (-4.58%) ⬇️
SRC/zhegst.f 86.44% <0.00%> (-4.58%) ⬇️
... and 1819 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5b07c5e...415e140. Read the comment docs.

@weslleyspereira weslleyspereira changed the title Adds safe scaling xLARTG from https://doi.org/10.1145/3061665 Add safe scaling xLARTG from https://doi.org/10.1145/3061665 Feb 19, 2021
@langou
Copy link
Contributor

langou commented Feb 19, 2021

So this is related to #419 when we had potential infinite loop. Ed's code does not have infinite loops. (Since there is no loop!) I think it was good to have PR #419. Now I also think that Ed's codes are better and we should move to this. Ed gave us quite some codes, so this PR is a first attempt to integrate in a limited fashion the codes. In addition, Ed gave us some nice testing for these subroutines. That quite exercises the range of the IEEE numbers. The code are in F90. The syntax is not the one from legacy LAPACK. This sounds good to me. Comments on the PR welcome.

It could be good to check #411 with this set of codes work with the matrix in #411.

@weslleyspereira weslleyspereira marked this pull request as draft February 19, 2021 21:18
@ilayn
Copy link
Contributor

ilayn commented Feb 20, 2021

You might also find this one as an interesting read (in case you already haven't) https://arxiv.org/pdf/1904.09481

@weslleyspereira weslleyspereira marked this pull request as ready for review February 22, 2021 19:48
@martin-frbg
Copy link
Collaborator

Minor comment - to my knowledge, "DEPRECATED" has been used for functions that got removed from canonical LAPACK for some reason, but may still be required by older software. Accordingly there is a BUILD_DEPRECATED option to revive them. I do not think there is an "attic" for superseded algorithms, and perhaps with git it is not needed.

@thijssteel
Copy link
Collaborator

I also don't think the deprecation is necessary. The interface is exactly the same, most users won't (and shouldn't) notice the change.

The code looks good, i don't mind fortran 90. I'm a bit worried about the documentation though, can doxygen parse that different format?

@langou
Copy link
Contributor

langou commented Feb 23, 2021

Hi @martin-frbg and @thijssteel,

@martin-frbg describes this as "an "attic" for superseded algorithms", yes that was exactly the intent of placing the superseded routines in DEPRECATED. Maybe we should have an ATTIC folder then. (Not kidding.) I am fine either ways on this issue. DEPRECATED, ATTIC, or not storing them. Since both @martin-frbg and @thijssteel seem in favor to not storing, fine with me. Probably cleaner to just move on, and remove the routines from the distribution and the master plain and simple. OK with me.

@thijssteel: "I'm a bit worried about the documentation though, can doxygen parse that different format?" => oh yes, good point, we need to make these routines "doxygen" friendly. And in particular the tags [in], [out], etc. are used by quite a few parser to automagically generate some wrappers, or what not. So, yes, I agree with @thijssteel, we need to add the "doxygen" preamble.

Cheers,
Julien.

@thijssteel
Copy link
Collaborator

Some more remarks:

  • Settings the parameters directly via a module is nice and clean ( probably faster too ), but i'm worried about this breaking some archaic non-ieee systems.
  • My professor always told me that real(wp) and double precision are not truly equivalent. Similar to the first remark, this should be fine for almost all systems, but some archaic non-ieee systems might differ.

@weslleyspereira
Copy link
Collaborator Author

  1. Yes... I agree with @martin-frbg and @thijssteel, there is no need to keep the two variants if we are covered by the version control system.
  2. I can work on real(wp) -> double precision to keep the code homogenized.
  3. I am trying a different strategy to include the scaling constants in the BLAS++ (see utils.hh in https://bitbucket.org/icl/blaspp/pull-requests/30). Here I just tried to be as closer as possible from Ed's code. Based on @thijssteel's feedback and on Add safe scaling (c,z)ROT routines from https://doi.org/10.1145/3061665 #496 (review), I think we could modify xLAMCH to include the new scaling constants and, thus, avoid the including modules. What do you think?

Some other questions:

  • Do you know an automatic way to build a Doxygen-compatible header? I use Qt Creator as IDE, and it has a Doxygen plugin, but I would like your opinion.
  • BTW, do you use an IDE for developing code for Lapack?

Thanks!

@thijssteel
Copy link
Collaborator

To be clear, i'm not against modules. I just expressed some (minor) concern that eps may not always be the same on some systems.

A better solution might be determining them on compilation time if that's possible.

And even then, do we even care about those machines?

@thijssteel
Copy link
Collaborator

BTW, do you use an IDE for developing code for Lapack?

A bit out of scope for this PR, but anyway. I use vscode, with some plugins you can get syntax highlighting and some error checking. I also wrote https://github.com/thijssteel/fortranfixedformlinter. It's a really basic linter that can automatically handle line continuations, but its a bit too agressive for LAPACK.

@weslleyspereira
Copy link
Collaborator Author

To be clear, i'm not against modules. I just expressed some (minor) concern that eps may not always be the same on some systems.

A better solution might be determining them on compilation time if that's possible.

And even then, do we even care about those machines?

I'm sorry if I expressed myself badly. Let me give a more complete explanation about what I meant:

  1. I am trying to apply a C++ alternative, where the constants can be computed at compile time (that code is still under review).
  2. xLAMCH computes some scaling constants directly from the INTRINSIC functions, and others, 'E', 'S', and 'P', involve additional floating-point operations.
  3. I understand using a module with the scaling constants is good for performance. Thanks to the discussion here and in Add safe scaling (c,z)ROT routines from https://doi.org/10.1145/3061665 #496, I also understand we should be careful about including them so as not to generate inconsistent nor incompatible code. That is why I suggested including the safe scaling constants as additional options to xLAMCH, and let the modules for a separate discussion. Or maybe we include them in this PR, but the main contributions here the safe scaling xLARTG algorithms I think.

@weslleyspereira
Copy link
Collaborator Author

The code looks good, i don't mind fortran 90. I'm a bit worried about the documentation though, can doxygen parse that different format?

I created a Doxygen preamble for slartg.f90 in 8be401e. I will now apply the changes for the remaining files. Tell me if you have any suggestions. Thanks!

@weslleyspereira weslleyspereira marked this pull request as draft March 1, 2021 15:14
@weslleyspereira
Copy link
Collaborator Author

In response to @langou's comment:

It could be good to check #411 with this set of codes work with the matrix in #411.

I created #502 to fix #411.
PR #493 can't fix #411 because the Inf is generated in CTGSJA before calling SLARTG.

@weslleyspereira weslleyspereira marked this pull request as ready for review March 3, 2021 23:17
@weslleyspereira
Copy link
Collaborator Author

I have just added Doxygen preambles to all new files. The la_constants module(s) depend(s) on #501.

@weslleyspereira weslleyspereira force-pushed the try-xLARTG-from-Edward-Anderson branch from 92cd6f1 to f1550e0 Compare March 31, 2021 22:18
langou
langou previously approved these changes Apr 6, 2021
@langou langou merged commit 799ef93 into master Apr 12, 2021
christoph-conrads pushed a commit to christoph-conrads/lapack that referenced this pull request May 23, 2021
@weslleyspereira weslleyspereira deleted the try-xLARTG-from-Edward-Anderson branch August 30, 2021 23:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants