Flame Logo

Top of Form

Bottom of Form

Top of Form

Bottom of Form

Linear Algebra - A modern presentation of linear algebra

by Martha Ganser and Robert van de Geijn


Highlights:


Cholesky factorization
LU factorization
This

Contents

  1. Purpose
  2. Navigating the Wiki
  3. Notation
  4. APIs
    1. FLAME@lab
    2. FLAME/C
    3. FLAME/F
  5. Performance
  6. Future architectures
  7. Sponsors
  8. Contact Us

Purpose

This wiki is an ever-growing repository of information related to linear algebra theory, practice, and experience. It is our hope that this makes the material more accessible to expert and novice alike. We hope it will become the Wikipedia of linear algebra.

Navigating the Wiki

This Wiki can be used in a number of different ways:

More Help

Notation

Learn more

          


APIs

When appropriate, implementations are given using a number of Application Programming Interfaces (APIs) that allow code to closely mirror the algorithms as presented in the FLAME notation.

FLAME@lab

FLAME@lab is the API for Octave's Mscript (free Matlab!), LabView's MathScript, and Matlab Mscript. Using this API, the above LU factorization is coded as

Try it


Learn more

          

o    function [ A_out ] = LU_unb_var5( A )

o     

o      [ ATL, ATR, ...

o        ABL, ABR ] = FLA_Part_2x2( A, 0, 0, 'FLA_TL' );

o     

o      while ( size( ATL, 1 ) < size( A, 1 ) & size( ATL, 2 ) < size( A, 2 ) )

o     

o        [ A00,  a01,     A02,  ...

o          a10t, alpha11, a12t, ...

o          A20,  a21,     A22 ] = FLA_Repart_2x2_to_3x3( ATL, ATR, ...

o                                                        ABL, ABR, 1, 1, 'FLA_BR' );

o        %------------------------------------------------------------%

o     

o        a21 = a21 / alpha11;

o        A22 = A22 - a21 * a12t;

o     

o        %------------------------------------------------------------%

o        [ ATL, ATR, ...

o          ABL, ABR ] = FLA_Cont_with_3x3_to_2x2( A00,  a01,     A02,  ...

o                                                 a10t, alpha11, a12t, ...

o                                                 A20,  a21,     A22, 'FLA_TL' );

o      end

o     

o      A_out = [ ATL, ATR

o                ABL, ABR ];

o     

o    return

FLAME/C

A similar API for the C programming language supports the following code for LU factorization:

 


Learn more

          

o    #include "FLAME.h"

o     

o    FLA_Error LU_unb_var5( FLA_Obj A )

o    {

o      FLA_Obj ATL,   ATR,      A00,  a01,     A02,

o              ABL,   ABR,      a10t, alpha11, a12t,

o                               A20,  a21,     A22;

o     

o      FLA_Part_2x2( A,    &ATL, &ATR,

o                          &ABL, &ABR,     0, 0, FLA_TL );

o     

o      while ( FLA_Obj_length( ATL ) < FLA_Obj_length( A ) ){

o     

o        FLA_Repart_2x2_to_3x3( ATL, /**/ ATR,       &A00,  /**/ &a01,     &A02,

o                            /* ************* */   /* *************************** */

o                                                    &a10t, /**/ &alpha11, &a12t,

o                               ABL, /**/ ABR,       &A20,  /**/ &a21,     &A22,

o                               1, 1, FLA_BR );

o     

o        /*------------------------------------------------------------*/

o     

o        /* a21 = a21 / alpha11; */

o        FLA_Inv_scal( alpha11, a21 );

o     

o        /* A22 = A22 - a21 * a12t; */

o        FLA_Ger( FLA_MINUS_ONE, a21, a12t, A22 );

o     

o        /*------------------------------------------------------------*/

o     

o        FLA_Cont_with_3x3_to_2x2( &ATL, /**/ &ATR,       A00,  a01,     /**/ A02,

o                                                         a10t, alpha11, /**/ a12t,

o                                /* ************** */  /* ************************* */

o                                  &ABL, /**/ &ABR,       A20,  a21,     /**/ A22,

o                                  FLA_TL );

o      }

o     

o      return FLA_SUCCESS;

o    }


FLAME/F

You asked for it, you got it! An API for FORTRAN supports the following code for LU factorization:

          

o          SUBROUTINE LU_unb_var1_f( A, ierror )

o          implicit none

o          include "FLAMEf.h"

o    !

o          INTEGER A( FLA_OBJ_SIZE ), IERROR

o    !

o          INTEGER

o         &  ATL( FLA_OBJ_SIZE),   ATR( FLA_OBJ_SIZE),  

o         &  ABL( FLA_OBJ_SIZE),   ABR( FLA_OBJ_SIZE),

o    !     

o         &  A00( FLA_OBJ_SIZE), a01( FLA_OBJ_SIZE),    A02( FLA_OBJ_SIZE),

o         &  a10t( FLA_OBJ_SIZE),alpha11( FLA_OBJ_SIZE),a12t( FLA_OBJ_SIZE),

o         &  A20( FLA_OBJ_SIZE), a21( FLA_OBJ_SIZE),    A22( FLA_OBJ_SIZE)

o    !

o    !

o    !

o    !

o    !

o          call FLA_Part_2x2_f( A,    ATL, ATR,

o         &                           ABL, ABR,     0, 0, FLA_TL, ierror )

o    !

o          do while ( FLA_Obj_length_f( ATL ) .lt. FLA_Obj_length_f( A ) )

o             call FLA_Repart_2x2_to_3x3_f(

o         &               ATL, ATR,      A00,  a01,     A02,

o         &                              a10t, alpha11, a12t,

o         &               ABL, ABR,      A20,  a21,     A22,

o         &               1, 1, FLA_BR, ierror )

o    !    ------------------------------------------------------------

o    !

o    !        a21 = a21 / alpha11

o    !

o             call FLA_Inv_scal_f( alpha11, a21, ierror );

o    !

o    !        A22 = A22 - a21 * a12t;

o    !

o             call FLA_Ger_f( FLA_MINUS_ONE, a21, a12t, A22, ierror );

o    !

o    !    ------------------------------------------------------------

o             call FLA_Cont_with_3x3_to_2x2_f(

o         &              ATL, ATR,      A00,  a01,     A02,

o         &                             a10t, alpha11, a12t,

o         &              ABL, ABR,      A20,  a21,     A22,

o         &              FLA_TL, ierror );

o          enddo

o    !

o          ierror = FLA_SUCCESS

o    !

o          return

o          end


Performance

We place a lot of emphasis on the fact that there are typically multiple algorithms for computing a given linear algebra operation. One reason is that under different circumstances different algorithms may perform better. Here "perform" may mean that they yield more accurate answers in the presense of round-off errors and/or that they produce the answer in less time.

The following graph illustrates the different rates of computation achieved by different algorithmic variants for LU factorization.

Learn more


Future architectures

While this wiki is primarily meant as a pedagogical tool, the techniques that are discussed, developed as part of the FLAME project, are actually very powerful. Below we report performance of the Cholesky factorization when multiple GPUs are employed.

Learn more


Sponsors

This wiki is sponsored in part by

Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation (NSF).

Contact Us