Quantcast
Channel: ListenData
Viewing all articles
Browse latest Browse all 425

SAS : Find Variable with the Max or Min in a row

$
0
0
This tutorial demonstrates how to find a variable with the maximum or minimum value for each row (observation) in SAS. It's pretty straightforward to calculate max or min value but a little problematic to identify the variable name against the value.

Let's create Sample Data
data readin;
input y1-y6;
cards;
11 55 59 35 25 87
12 79 73 74 86 29
13 80 95 77 25 74
;
run;
Find Variable Name Containing Max Value in a row
data out;
set readin;
array values y1-y6;
largest = max(of values[*]);
index    = whichn(largest, of values[*]);
name = vname(values[index]);
proc print;
run;
Output

Explanation :
  1. array values y1-y6 : Lists all the variables for calculating max value
  2. max() function calculates maximum value of all the variables listed in step 1 across rows
  3. whichn() function returns the column index number of the matching value. In this case, it is searching the maximum value across rows and returns the column position in the listed variables. For example, it returns 6 in row 1 as 87 is the maximum value in row 1 and it is placed at 6th column of y1-y6
  4. vname() function returns the variable name. In this case, it calculates variable name of the largest value.

To Find Variable Name containing minimum value
Use min() function instead of max() in the code above and the remaining code would be same.

Find Top 3 Variables

Suppose you are asked to identify top 3 variables across rows. You can use largest function in SAS.

LARGEST Function : Syntax
largest(k, variables)
k : kth value you want (2 for the second largest value)
data want;
   set readin;
   array values[*] y1-y6;
   array large[3];
   array names[3] $32;
   do i = 1 to dim(large);
      large[i] = largest(i,of values[*]);
      index    = whichn(large[i],of values[*]);
      names[i] = vname(values[index]);
   end;
   drop i index;
proc print;
run;
SAS : LARGEST values

Explanation :
  1. array values[*] y1-y6  -  Specify all the variables from which you want to calculate top 3 variables
  2. array large[3] - Top 3 large values
  3. array names[3] $32 - Names of top 3 variables
  4. do i = 1 to dim(large) - 3 iterations for calculating first, second and third largest values
  5. large[i] = largest(i,of values[*]) - largest value when i =1, second largest when i =2 and so on.
  6. index = whichn(large[i],of values[*]) : Column index for kth largest values
  7. names[i]    = vname(values[index]) :  Extract the variable name of the largest using largest index
  8. drop i index; : Dropping Irrelevant Variables

Find Bottom 3 Variables
Refer the code above and change largest() function to smallest() function.

Viewing all articles
Browse latest Browse all 425

Trending Articles