SAS SCORE Procedure User’s Guide

SAS SCORE Procedure User’s Guide

Below you will find brief information for SCORE Procedure. The SCORE procedure multiplies values from two SAS data sets, one containing coefficients (for example, factor-scoring coefficients or regression coefficients) and the other containing raw data to be scored using the coefficients from the first data set. The result of this multiplication is a SAS data set containing linear combinations of the coefficients and the raw data values. The data set can contain the original data used to calculate the scoring coefficients, or it can contain an entirely different data set. The scoring coefficients and the variables in the raw data set that are used in scoring must have the same names.

advertisement

Assistant Bot

Need help? Our chatbot has already read the manual and is ready to assist you. Feel free to ask any questions about the device, but providing details will make the conversation more productive.

SAS SCORE Procedure User's Guide - SAS | Manualzz
®
SAS/STAT 13.1 User’s Guide
The SCORE Procedure
This document is an individual chapter from SAS/STAT® 13.1 User’s Guide.
The correct bibliographic citation for the complete manual is as follows: SAS Institute Inc. 2013. SAS/STAT® 13.1 User’s Guide.
Cary, NC: SAS Institute Inc.
Copyright © 2013, SAS Institute Inc., Cary, NC, USA
All rights reserved. Produced in the United States of America.
For a hard-copy book: No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by
any means, electronic, mechanical, photocopying, or otherwise, without the prior written permission of the publisher, SAS Institute
Inc.
For a web download or e-book: Your use of this publication shall be governed by the terms established by the vendor at the time
you acquire this publication.
The scanning, uploading, and distribution of this book via the Internet or any other means without the permission of the publisher is
illegal and punishable by law. Please purchase only authorized electronic editions and do not participate in or encourage electronic
piracy of copyrighted materials. Your support of others’ rights is appreciated.
U.S. Government License Rights; Restricted Rights: The Software and its documentation is commercial computer software
developed at private expense and is provided with RESTRICTED RIGHTS to the United States Government. Use, duplication or
disclosure of the Software by the United States Government is subject to the license terms of this Agreement pursuant to, as
applicable, FAR 12.212, DFAR 227.7202-1(a), DFAR 227.7202-3(a) and DFAR 227.7202-4 and, to the extent required under U.S.
federal law, the minimum restricted rights as set out in FAR 52.227-19 (DEC 2007). If FAR 52.227-19 is applicable, this provision
serves as notice under clause (c) thereof and no other notice is required to be affixed to the Software or documentation. The
Government’s rights in Software and documentation shall be only those set forth in this Agreement.
SAS Institute Inc., SAS Campus Drive, Cary, North Carolina 27513-2414.
December 2013
SAS provides a complete selection of books and electronic products to help customers use SAS® software to its fullest potential. For
more information about our offerings, visit support.sas.com/bookstore or call 1-800-727-3228.
SAS® and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the
USA and other countries. ® indicates USA registration.
Other brand and product names are trademarks of their respective companies.
Gain Greater Insight into Your
SAS Software with SAS Books.
®
Discover all that you need on your journey to knowledge and empowerment.
support.sas.com/bookstore
for additional books and resources.
SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are
trademarks of their respective companies. © 2013 SAS Institute Inc. All rights reserved. S107969US.0613
Chapter 86
The SCORE Procedure
Contents
Overview: SCORE Procedure . . . . . . . . . . . . . .
Raw Data Set . . . . . . . . . . . . . . . . . . . .
Scoring Coefficients Data Set . . . . . . . . . . .
Standardization of Raw Data . . . . . . . . . . . .
Getting Started: SCORE Procedure . . . . . . . . . . .
Syntax: SCORE Procedure . . . . . . . . . . . . . . . .
PROC SCORE Statement . . . . . . . . . . . . .
BY Statement . . . . . . . . . . . . . . . . . . .
ID Statement . . . . . . . . . . . . . . . . . . . .
VAR Statement . . . . . . . . . . . . . . . . . . .
Details: SCORE Procedure . . . . . . . . . . . . . . . .
Missing Values . . . . . . . . . . . . . . . . . . .
Regression Parameter Estimates from PROC REG
Output Data Set . . . . . . . . . . . . . . . . . .
Computational Resources . . . . . . . . . . . . .
Examples: SCORE Procedure . . . . . . . . . . . . . .
Example 86.1: Factor Scoring Coefficients . . . .
Example 86.2: Regression Parameter Estimates .
Example 86.3: Custom Scoring Coefficients . . .
References . . . . . . . . . . . . . . . . . . . . . . . .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
7343
7344
7344
7344
7345
7350
7350
7351
7352
7352
7353
7353
7353
7353
7354
7354
7354
7359
7364
7365
Overview: SCORE Procedure
The SCORE procedure multiplies values from two SAS data sets, one containing coefficients (for example,
factor-scoring coefficients or regression coefficients) and the other containing raw data to be scored using
the coefficients from the first data set. The result of this multiplication is a SAS data set containing linear
combinations of the coefficients and the raw data values.
Many statistical procedures output coefficients that PROC SCORE can apply to raw data to produce scores.
The new score variable is formed as a linear combination of raw data and scoring coefficients. For each
observation in the raw data set, PROC SCORE multiplies the value of a variable in the raw data set by the
matching scoring coefficient from the data set of scoring coefficients. This multiplication process is repeated
for each variable in the VAR statement. The resulting products are then summed to produce the value of the
new score variable. This entire process is repeated for each observation in the raw data set. In other words,
PROC SCORE cross multiplies part of one data set with another.
7344 F Chapter 86: The SCORE Procedure
Raw Data Set
The raw data set can contain the original data used to calculate the scoring coefficients, or it can contain
an entirely different data set. The raw data set must contain all the variables needed to produce scores. In
addition, the scoring coefficients and the variables in the raw data set that are used in scoring must have the
same names. See the section “Getting Started: SCORE Procedure” on page 7345 for more information.
Scoring Coefficients Data Set
The data set containing scoring coefficients must contain two special variables: the _TYPE_ variable and the
_NAME_ or _MODEL_ variable.
• The _TYPE_ variable identifies the observations that contain scoring coefficients.
• The _NAME_ or _MODEL_ variable provides a SAS name for the new score variable.
PROC SCORE first looks for a _NAME_ variable in the SCORE= input data set. If there is such a variable,
the variable’s value is what SCORE uses to name the new score variable. If the SCORE= data set does not
have a _NAME_ variable, then PROC SCORE looks for a _MODEL_ variable.
For example, PROC FACTOR produces an output data set that contains factor-scoring coefficients. In this
output data set, the scoring coefficients are identified by _TYPE_=’SCORE’. For _TYPE_=’SCORE’, the
_NAME_ variable has values of ’Factor1’, ’Factor2’, and so forth. PROC SCORE gives the new score
variables the names Factor1, Factor2, and so forth.
As another example, the REG procedure produces an output data set that contains parameter estimates. In
this output data set, the parameter estimates are identified by _TYPE_=’PARMS’. The _MODEL_ variable
contains the label used in the MODEL statement in PROC REG, or it uses MODELn if no label is specified.
This label is the name PROC SCORE gives to the new score variable.
Standardization of Raw Data
PROC SCORE automatically standardizes or centers the DATA= variables for you, based on information
from the original variables and analysis from the SCORE= data set.
If the SCORE= scoring coefficients data set contains observations with _TYPE_=’MEAN’ and
_TYPE_=’STD’, then PROC SCORE standardizes the raw data before scoring. For example, this
type of SCORE= data set can come from PROC PRINCOMP without the COV option.
If the SCORE= scoring coefficients data set contains observations with _TYPE_=’MEAN’ but _TYPE_=’STD’
is absent, then PROC SCORE centers the raw data (the means are subtracted) before scoring. For example,
this type of SCORE= data set can come from PROC PRINCOMP with the COV option.
If the SCORE= scoring coefficients data set does not contain observations with _TYPE_=’MEAN’ and
_TYPE_=’STD’, or if you use the NOSTD option, then PROC SCORE does not center or standardize the raw
data.
Getting Started: SCORE Procedure F 7345
If the SCORE= scoring coefficients are obtained from observations with _TYPE_=’USCORE’, then
PROC SCORE “standardizes” the raw data by using the uncorrected standard deviations identified by
_TYPE_=’USTD’, and the means are not subtracted from the raw data. For example, this type of
SCORE= data set can come from PROC PRINCOMP with the NOINT option. For more information
about _TYPE_=’USCORE’ scoring coefficients in TYPE=UCORR or TYPE=UCOV output data sets, see
Appendix A, “Special SAS Data Sets.”
You can use PROC SCORE to score the data that were also used to generate the scoring coefficients, although
more typically, scoring results are directly obtained from the OUT= data set in a procedure that computes
scoring coefficients. When scoring new data, it is important to realize that PROC SCORE assumes that the
new data have approximately the same scales as the original data. For example, if you specify the COV
option with PROC PRINCOMP for the original analysis, the scoring coefficients in the PROC PRINCOMP
OUTSTAT= data set are not appropriate for standardized data. With the COV option, PROC PRINCOMP will
not output _TYPE_=’STD’ observations to the OUTSTAT= data set, and PROC SCORE will only subtract the
means of the original (not new) variables from the new variables before multiplying. Without the COV option
in PROC PRINCOMP, both the original variable means and standard deviations will be in the OUTSTAT=
data set, and PROC SCORE will subtract the original variable means from the new variables and divide them
by the original variable standard deviations before multiplying.
In general, procedures that output scoring coefficients in their OUTSTAT= data sets provide the necessary information for PROC SCORE to determine the appropriate standardization. However, if you use
PROC SCORE with a scoring coefficients data set that you constructed without _TYPE_=’MEAN’ and
_TYPE_=’STD’ observations, you might have to do the relevant centering or standardization of the new data
first. If you do this, you must use the means and standard deviations of the original variables—that is, the
variables that were used to generate the coefficients—not the means and standard deviations of the variables
to be scored.
See the section “Getting Started: SCORE Procedure” on page 7345 for further illustration.
Getting Started: SCORE Procedure
The SCORE procedure multiplies the values from two SAS data sets and creates a new data set to contain the
results of the multiplication. The variables in the new data set are linear combinations of the variables in the
two input data sets. Typically, one of these data sets contains raw data that you want to score, and the other
data set contains scoring coefficients.
The following example demonstrates how to use the SCORE procedure to multiply values from two SAS
data sets, one containing factor-scoring coefficients and the other containing raw data to be scored using the
scoring coefficients.
Suppose you are interested in the performance of three different types of schools: private schools, state-run
urban schools, and state-run rural schools. You want to compare the schools’ performances as measured by
student grades on standard tests in English, mathematics, and biology. You administer these tests and record
the scores for each of the three types of schools.
The following DATA step creates the SAS data set Schools. The data are provided by Chaseling (1996).
7346 F Chapter 86: The SCORE Procedure
data Schools;
input Type $ English Math Biology @@;
datalines;
p 52 55 45 p 42 49 40 p 63 64
p 47 50 51 p 64 69 47 p 63 67
p 59 63 42 p 56 61 41 p 41 44
p 39 42 45 p 56 63 44 p 63 73
54
54
72
42
... more lines ...
r
;
50
47
49
r
55
48
46
r
38
36
51
The data set Schools contains the character variable Type, which represents the type of school. Valid values
are p (private schools), r (state-run rural schools), and u (state-run urban schools).
The three numeric variables in the data set are English, Math, and Biology, which represent the student
scores for English, mathematics, and biology, respectively. The double trailing at sign (@@) in the INPUT
statement specifies that observations are input from each line until all values are read.
The following statements invoke the FACTOR procedure to compute the data set of factor scoring coefficients.
The statements perform a principal components factor analysis that uses all three numeric variables in the
SAS data set Schools. The OUTSTAT= option requests that PROC FACTOR output the factor scores to the
data set Scores. The NOPRINT option suppresses display of the output.
proc factor data=Schools score outstat=Scores noprint;
var english math biology;
run;
proc score data=schools score=Scores out=New;
var english math biology;
id type;
run;
Getting Started: SCORE Procedure F 7347
The SCORE procedure is then invoked using Schools as the raw data set to be scored and Scores as the
scoring data set. The OUT= option creates the SAS data set New to contain the linear combinations.
The VAR statement specifies that the variables English, Math, and Biology are used in computing scores. The
ID statement copies the variable Type from the Schools data set to the output data set New.
The following statements print the SAS output data set Scores, the first two observations from the original
data set Schools, and the first two observations of the resulting data set New.
title 'OUTSTAT= Data Set from PROC FACTOR';
proc print data=Scores;
run;
title 'First Two Observations of the DATA= Data Set from PROC SCORE';
proc print data=Schools(obs=2);
run;
title 'First Two Observations of the OUT= Data Set from PROC SCORE';
proc print data=New(obs=2);
run;
Figure 86.1 displays the output data set Scores produced by the FACTOR procedure. The last observation
(number 11) contains the scoring coefficients (_TYPE_=’SCORE’). Only one factor has been retained.
Figure 86.1 Listing of the Data Set Created by PROC FACTOR
OUTSTAT= Data Set from PROC FACTOR
Obs
1
2
3
4
5
6
7
8
9
10
11
_TYPE_
MEAN
STD
N
CORR
CORR
CORR
COMMUNAL
PRIORS
EIGENVAL
PATTERN
SCORE
_NAME_
English
Math
Biology
Factor1
Factor1
English
Math
Biology
55.525
12.949
120.000
1.000
0.833
0.672
0.881
1.000
2.405
0.939
0.390
52.325
12.356
120.000
0.833
1.000
0.594
0.827
1.000
0.437
0.910
0.378
50.350
12.239
120.000
0.672
0.594
1.000
0.696
1.000
0.159
0.834
0.347
7348 F Chapter 86: The SCORE Procedure
Figure 86.2 lists the first two observations of the original SAS data set (Schools).
Figure 86.2 First Two Observations of the Schools Data Set
First Two Observations of the DATA= Data Set from PROC SCORE
Obs
1
2
Type
English
p
p
52
42
Math
55
49
Biology
45
40
Figure 86.3 lists the first two observations of the output data set New created by PROC SCORE.
Figure 86.3 Listing of the New Data Set
First Two Observations of the OUT= Data Set from PROC SCORE
Obs
1
2
Type
Factor1
p
p
-0.17604
-0.80294
The score variable Factor1 in the New data set is named according to the value of the _NAME_ variable in the
Scores data set. The values of the variable Factor1 are computed as follows: the DATA= data set variables
are standardized using the same means and standard deviations that PROC FACTOR used when extracting
the factors because the Scores data set contains observations with _TYPE_=’MEAN’ and _TYPE_=’STD’.
Note that in order to correctly use standardized scoring coefficients created by other procedures such as
PROC FACTOR in this example, the data to be scored must be standardized in the same way that the data
were standardized when the scoring coefficients were computed. Otherwise, the resulting scores might be
incorrect. PROC SCORE does this automatically if the SCORE= data set is the original OUTSTAT= data set
output from the procedure creating the scoring coefficients.
These standardized variables are then multiplied by their respective standardized scoring coefficients from
the data set Scores. These products are summed over all three variables, and the sum is the value of the new
variable Factor1. The first two values of the scored variable Factor1 are obtained as follows:
.52
55:525/
.55 52:325/
.45 50:350/
0:390 C
0:378 C
0:347 D
12:949
12:356
12:239
0:17604
.42
55:525/
.49 52:325/
.40 50:350/
0:390 C
0:378 C
0:347 D
12:949
12:356
12:239
0:80294
Getting Started: SCORE Procedure F 7349
The following statements request that the SGPLOT procedure produce a horizontal bar chart of the variable
Type. The length of each bar represents the mean of the variable Factor1.
title 'Mean Score of Variable Factor1 by Each Type of Schools';
proc sgplot data=New;
hbar type / stat = mean response=Factor1;
run;
Figure 86.4 displays the mean score of the variable Factor1 for each of the three school types. For private
schools (Type=p), the average value of the variable Factor1 is 0.384, while for state-run schools the average
values are much lower. The state-run urban schools (Type=u) have the lowest mean value of –0.202, and the
state-run rural schools (Type=r) have a mean value of –0.183.
Figure 86.4 Bar Chart of School Type
7350 F Chapter 86: The SCORE Procedure
Syntax: SCORE Procedure
The following statements are available in the SCORE procedure:
PROC SCORE DATA=SAS-data-set < options > ;
BY variables ;
ID variables ;
VAR variables ;
The only required statement is the PROC SCORE statement. The BY, ID, and VAR statements are described
following the PROC SCORE statement.
PROC SCORE Statement
PROC SCORE DATA=SAS-data-set < options > ;
The PROC SCORE statement invokes the SCORE procedure. Table 86.1 summarizes the options available in
the PROC SCORE statement.
Table 86.1 PROC SCORE Statement Options
Option
Description
DATA=
NOSTD
OUT=
PREDICT
RESIDUAL
SCORE=
TYPE=
Names the input SAS data set containing the raw data to score
Suppresses centering and scaling of the raw data
Specifies the name of the SAS data set
Treats coefficients of –1 as 0
Reverses the sign of each score
Names the data set containing the scoring coefficients
Specifies the observations that contain scoring coefficients
You can specify the following options in the PROC SCORE statement.
DATA=SAS-data-set
names the input SAS data set containing the raw data to score. This option is required.
NOSTD
suppresses centering and scaling of the raw data. Ordinarily, if PROC SCORE finds _TYPE_=‘MEAN’,
_TYPE_= ‘USCORE’, _TYPE_=‘USTD’, or _TYPE_=‘STD’ observations in the SCORE= data set,
the procedure uses these to standardize the raw data before scoring.
OUT=SAS-data-set
specifies the name of the SAS data set created by PROC SCORE. If you want to create a SAS data set
in a permanent library, you must specify a two-level name. For more information about permanent
libraries and SAS data sets, see SAS Language Reference: Concepts. If the OUT= option is omitted,
PROC SCORE still creates an output data set and automatically names it according to the DATAn
convention, as if you omitted a data set name in a DATA statement.
BY Statement F 7351
PREDICT
specifies that PROC SCORE should treat coefficients of –1 in the SCORE= data set as 0. In regression
applications, the dependent variable is coded with a coefficient of –1. Applied directly to regression
results, PROC SCORE produces negative residuals (see the description of the RESIDUAL option,
which follows); the PREDICT option produces predicted values instead.
RESIDUAL
reverses the sign of each score. Applied directly to regression results, PROC SCORE produces negative
residuals (PREDICT–ACTUAL); the RESIDUAL option produces positive residuals (ACTUAL–
PREDICT) instead.
SCORE=SAS-data-set
names the data set containing the scoring coefficients. If you omit the SCORE= option, the most
recently created SAS data set is used. This data set must have two special variables: _TYPE_ and
either _NAME_ or _MODEL_.
TYPE=name or ‘string’
specifies the observations in the SCORE= data set that contain scoring coefficients. The TYPE=
procedure option is unrelated to the data set option that has the same name. PROC SCORE examines
the values of the special variable _TYPE_ in the SCORE= data set. When the value of _TYPE_ matches
TYPE=name, the observation in the SCORE= data set is used to score the raw data in the DATA= data
set.
If you omit the TYPE= option, scoring coefficients are read from observations with either
_TYPE_=’SCORE’ or _TYPE_=’USCORE’. Because the default for PROC SCORE is TYPE=SCORE,
you need not specify the TYPE= option for factor scoring or for computing scores from OUTSTAT=
data sets from the CANCORR, CANDISC, PRINCOMP, or VARCLUS procedure. When you use
regression coefficients from PROC REG, specify TYPE=PARMS.
The maximum length of the argument specified in the TYPE= option depends on the length defined
by the VALIDVARNAME= SAS system option. For additional information, see SAS System Options:
Reference.
Note that the TYPE= option setting is not case sensitive. For example, the two option settings
TYPE=’MyScore’ and TYPE=’myscore’ are equivalent.
BY Statement
BY variables ;
You can specify a BY statement with PROC SCORE to obtain separate analyses of observations in groups
that are defined by the BY variables. When a BY statement appears, the procedure expects the input data
set to be sorted in order of the BY variables. If you specify more than one BY statement, only the last one
specified is used.
If your input data set is not sorted in ascending order, use one of the following alternatives:
• Sort the data by using the SORT procedure with a similar BY statement.
7352 F Chapter 86: The SCORE Procedure
• Specify the NOTSORTED or DESCENDING option in the BY statement for the SCORE procedure.
The NOTSORTED option does not mean that the data are unsorted but rather that the data are arranged
in groups (according to values of the BY variables) and that these groups are not necessarily in
alphabetical or increasing numeric order.
• Create an index on the BY variables by using the DATASETS procedure (in Base SAS software).
For more information about BY-group processing, see the discussion in SAS Language Reference: Concepts.
For more information about the DATASETS procedure, see the discussion in the Base SAS Procedures Guide.
You can specify a BY statement to apply separate groups of scoring coefficients to the entire DATA= data set.
If the DATA= data set does not contain any of the BY variables, the entire DATA= data set is scored by each
BY group of scoring coefficients in the SCORE= data set.
If the DATA= data set contains some but not all of the BY variables, or if some BY variables do not have the
same type or length in the DATA= data set as in the SCORE= data set, then PROC SCORE prints an error
message and stops.
If all the BY variables appear in the DATA= data set with the same type and length as in the SCORE= data
set, then each BY group in the DATA= data set is scored using scoring coefficients from the corresponding
BY group in the SCORE= data set. The BY groups in the DATA= data set must be in the same order as in the
SCORE= data set. All BY groups in the DATA= data set must also appear in the SCORE= data set. If you do
not specify the NOTSORTED option, some BY groups can appear in the SCORE= data set but not in the
DATA= data set; such BY groups are not used in computing scores.
ID Statement
ID variables ;
The ID statement identifies variables from the DATA= data set to be included in the OUT= data set. If there is
no ID statement, all variables from the DATA= data set are included in the OUT= data set. The ID variables
can be character or numeric.
VAR Statement
VAR variables ;
The VAR statement specifies the variables to be used in computing scores. These variables must be in both
the DATA= and SCORE= input data sets and must be numeric. If you do not specify a VAR statement, the
procedure uses all numeric variables in the SCORE= data set. You should almost always specify a VAR
statement with PROC SCORE because you would rarely use all the numeric variables in your data set to
compute scores.
Details: SCORE Procedure F 7353
Details: SCORE Procedure
Missing Values
If one of the scoring variables in the DATA= data set has a missing value for an observation, all the scores
have missing values for that observation. The exception to this criterion is that if the PREDICT option is
specified, the variable with a coefficient of –1 can tolerate a missing value and still produce a prediction
score. Also, a variable with a coefficient of 0 can tolerate a missing value.
If a scoring coefficient in the SCORE= data set has a missing value for an observation, the coefficient is
not used in creating the new score variable for the observation. In other words, missing values of scoring
coefficients are treated as zeros. This treatment affects only the observation in which the missing value
occurs.
Regression Parameter Estimates from PROC REG
If the SCORE= data set is an OUTEST= data set produced by PROC REG and if you specify TYPE=PARMS,
the interpretation of the new score variables depends on the PROC SCORE options chosen and the variables
listed in the VAR statement. If the VAR statement contains only the independent variables used in a model in
PROC REG, the new score variables give the predicted values. If the VAR statement contains the dependent
variables and the independent variables used in a model in PROC REG, the interpretation of the new score
variables depends on the PROC SCORE options chosen. If you omit both the PREDICT and the RESIDUAL
options, the new score variables give negative residuals (PREDICT–ACTUAL). If you specify the RESIDUAL
option, the new score variables give positive residuals (ACTUAL–PREDICT). If you specify the PREDICT
option, the new score variables give predicted values.
Unless you specify the NOINT option for PROC REG, the OUTEST= data set contains the variable Intercept.
The SCORE procedure uses the intercept value in computing the scores.
Output Data Set
PROC SCORE produces an output data set but displays no output. The output OUT= data set contains the
following variables:
• the ID variables, if any
• all variables from the DATA= data set, if no ID variables are specified
• the BY variables, if any
• the new score variables, named from the _NAME_ or _MODEL_ values in the SCORE= data set
7354 F Chapter 86: The SCORE Procedure
Computational Resources
Let
v
s
b
n
= number of variables used in computing scores
= number of new score variables
= maximum number of new score variables in a BY group
= original input value
Memory
The array storage required is approximately 8.4v C .3 C v/b C s/ bytes. When you do not use BY processing,
the array storage required is approximately 8.4v C .4 C v/s/ bytes.
Time
The time required to construct the scoring matrix is roughly proportional to vs, and the time needed to
compute the scores is roughly proportional to nvs.
Examples: SCORE Procedure
The following three examples use a subset of the Fitness data set. The complete data set is given in Chapter 83,
“The REG Procedure.”
Example 86.1: Factor Scoring Coefficients
This example shows how to use PROC SCORE with factor scoring coefficients. First, the FACTOR procedure
produces an output data set containing scoring coefficients in observations identified by _TYPE_=’SCORE’.
These data, together with the original data set Fitness, are supplied to PROC SCORE, resulting in a
data set containing scores Factor1 and Factor2. The following statements produce Output 86.1.1 through
Output 86.1.3:
/* This data set contains only the first 12 observations
*/
/* from the full data set used in the chapter on PROC REG. */
data Fitness;
input Age Weight Oxygen RunTime RestPulse RunPulse @@;
datalines;
44 89.47 44.609 11.37 62 178
40 75.07 45.313 10.07 62 185
44 85.84 54.297 8.65 45 156
42 68.15 59.571 8.17 40 166
38 89.02 49.874 9.22 55 178
47 77.45 44.811 11.63 58 176
40 75.98 45.681 11.95 70 176
43 81.19 49.091 10.85 64 162
44 81.42 39.442 13.08 63 174
38 81.87 60.055 8.63 48 170
44 73.03 50.541 10.13 45 168
45 87.66 37.388 14.03 56 186
;
Example 86.1: Factor Scoring Coefficients F 7355
proc factor data=Fitness outstat=FactOut
method=prin rotate=varimax score;
var Age Weight RunTime RunPulse RestPulse;
title 'Factor Scoring Example';
run;
proc print data=FactOut;
title2 'Data Set from PROC FACTOR';
run;
proc score data=Fitness score=FactOut out=FScore;
var Age Weight RunTime RunPulse RestPulse;
run;
proc print data=FScore;
title2 'Data Set from PROC SCORE';
run;
Output 86.1.1 shows the PROC FACTOR output. The scoring coefficients for the two factors are shown at
the end of the PROC FACTOR output.
Output 86.1.1 Creating an OUTSTAT= Data Set with PROC FACTOR
Factor Scoring Example
The FACTOR Procedure
Input Data Type
Number of Records Read
Number of Records Used
N for Significance Tests
Raw Data
12
12
12
Factor Scoring Example
The FACTOR Procedure
Initial Factor Method: Principal Components
Prior Communality Estimates: ONE
Eigenvalues of the Correlation Matrix: Total = 5
1
2
3
4
5
Average = 1
Eigenvalue
Difference
Proportion
Cumulative
2.30930638
1.19219952
0.88222702
0.50256713
0.11369996
1.11710686
0.30997249
0.37965990
0.38886717
0.4619
0.2384
0.1764
0.1005
0.0227
0.4619
0.7003
0.8767
0.9773
1.0000
7356 F Chapter 86: The SCORE Procedure
Output 86.1.1 continued
2 factors will be retained by the MINEIGEN criterion.
Factor Pattern
Factor1
Factor2
0.29795
0.43282
0.91983
0.72671
0.81179
0.93675
-0.17750
0.28782
-0.38191
-0.23344
Age
Weight
RunTime
RunPulse
RestPulse
Variance Explained by Each Factor
Factor1
Factor2
2.3093064
1.1921995
Final Communality Estimates: Total = 3.501506
Age
Weight
RunTime
RunPulse
RestPulse
0.96628351
0.21883401
0.92893333
0.67396207
0.71349297
Factor Scoring Example
The FACTOR Procedure
Rotation Method: Varimax
Orthogonal Transformation Matrix
1
2
1
2
0.92536
-0.37908
0.37908
0.92536
Rotated Factor Pattern
Age
Weight
RunTime
RunPulse
RestPulse
Factor1
Factor2
-0.07939
0.46780
0.74207
0.81725
0.83969
0.97979
-0.00018
0.61503
-0.07792
0.09172
Example 86.1: Factor Scoring Coefficients F 7357
Output 86.1.1 continued
Variance Explained by Each Factor
Factor1
Factor2
2.1487753
1.3527306
Final Communality Estimates: Total = 3.501506
Age
Weight
RunTime
RunPulse
RestPulse
0.96628351
0.21883401
0.92893333
0.67396207
0.71349297
Factor Scoring Example
The FACTOR Procedure
Rotation Method: Varimax
Scoring Coefficients Estimated by Regression
Squared Multiple Correlations of the Variables with Each Factor
Factor1
Factor2
1.0000000
1.0000000
Standardized Scoring Coefficients
Age
Weight
RunTime
RunPulse
RestPulse
Factor1
Factor2
-0.17846
0.22987
0.27707
0.41263
0.39952
0.77600
-0.06672
0.37440
-0.17714
-0.04793
Output 86.1.2 lists the OUTSTAT= data set from PROC FACTOR. Note that observations 18 and 19 have
_TYPE_=’SCORE’. Observations 1 and 2 have _TYPE_=’MEAN’ and _TYPE_=’STD’, respectively. These
four observations are used by PROC SCORE.
7358 F Chapter 86: The SCORE Procedure
Output 86.1.2 OUTSTAT= Data Set from PROC FACTOR Reproduced with PROC PRINT
Factor Scoring Example
Data Set from PROC FACTOR
Obs
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
_TYPE_
MEAN
STD
N
CORR
CORR
CORR
CORR
CORR
COMMUNAL
PRIORS
EIGENVAL
UNROTATE
UNROTATE
TRANSFOR
TRANSFOR
PATTERN
PATTERN
SCORE
SCORE
_NAME_
Age
Weight
RunTime
RunPulse
RestPulse
Factor1
Factor2
Factor1
Factor2
Factor1
Factor2
Factor1
Factor2
Age
Weight
RunTime
RunPulse
Rest
Pulse
42.4167
2.8431
12.0000
1.0000
0.0128
0.5005
-0.0953
-0.0080
0.9663
1.0000
2.3093
0.2980
0.9368
0.9254
0.3791
-0.0794
0.9798
-0.1785
0.7760
80.5125
6.7660
12.0000
0.0128
1.0000
0.2637
0.1731
0.2396
0.2188
1.0000
1.1922
0.4328
-0.1775
-0.3791
0.9254
0.4678
-0.0002
0.2299
-0.0667
10.6483
1.8444
12.0000
0.5005
0.2637
1.0000
0.5555
0.6620
0.9289
1.0000
0.8822
0.9198
0.2878
.
.
0.7421
0.6150
0.2771
0.3744
172.917
8.918
12.000
-0.095
0.173
0.556
1.000
0.485
0.674
1.000
0.503
0.727
-0.382
.
.
0.817
-0.078
0.413
-0.177
55.6667
9.2769
12.0000
-0.0080
0.2396
0.6620
0.4853
1.0000
0.7135
1.0000
0.1137
0.8118
-0.2334
.
.
0.8397
0.0917
0.3995
-0.0479
Since the PROC SCORE statement does not contain the NOSTD option, the data in the Fitness data set
are standardized before scoring. For each variable specified in the VAR statement, the mean and standard
deviation are obtained from the FactOut data set. For each observation in the Fitness data set, the variables
are then standardized. For example, for observation 1 in the Fitness data set, the variable Age is standardized
to 0:5569 D Œ.44 42:4167/=2:8431.
After the data in the Fitness data set are standardized, the standardized values of the variables in the VAR
statement are multiplied by the matching coefficients in the FactOut data set, and the resulting products are
summed. This sum is output as a value of the new score variable.
Output 86.1.3 displays the FScore data set produced by PROC SCORE. This data set contains the variables
Age, Weight, Oxygen, RunTime, RestPulse, and RunPulse from the Fitness data set. It also contains Factor1
and Factor2, the two new score variables.
Example 86.2: Regression Parameter Estimates F 7359
Output 86.1.3 OUT= Data Set from PROC SCORE Reproduced with PROC PRINT
Factor Scoring Example
Data Set from PROC SCORE
Obs
Age
Weight
Oxygen
Run
Time
1
2
3
4
5
6
7
8
9
10
11
12
44
40
44
42
38
47
40
43
44
38
44
45
89.47
75.07
85.84
68.15
89.02
77.45
75.98
81.19
81.42
81.87
73.03
87.66
44.609
45.313
54.297
59.571
49.874
44.811
45.681
49.091
39.442
60.055
50.541
37.388
11.37
10.07
8.65
8.17
9.22
11.63
11.95
10.85
13.08
8.63
10.13
14.03
Rest
Pulse
Run
Pulse
Factor1
Factor2
178
185
156
166
178
176
176
162
174
170
168
186
0.82129
0.71173
-1.46064
-1.76087
0.55819
-0.00113
0.95318
-0.12951
0.66267
-0.44496
-1.11832
1.20836
0.35663
-0.99605
0.36508
-0.27657
-1.67684
1.40715
-0.48598
0.36724
0.85740
-1.53103
0.55349
1.05948
62
62
45
40
55
58
70
64
63
48
45
56
Example 86.2: Regression Parameter Estimates
In this example, PROC REG computes regression parameter estimates for the Fitness data. (See Example 86.1
to for more information about how to create the Fitness data set.) The parameter estimates are output to a
data set and used as scoring coefficients. For the first part of this example, PROC SCORE is used to score the
Fitness data, which are the same data used in the regression.
In the second part of this example, PROC SCORE is used to score a new data set, Fitness2. For PROC
SCORE, the TYPE= specification is PARMS, and the names of the score variables are found in the variable
_MODEL_, which gets its values from the model label. The following code produces Output 86.2.1 through
Output 86.2.3:
proc reg data=Fitness outest=RegOut;
OxyHat: model Oxygen=Age Weight RunTime RunPulse RestPulse;
title 'Regression Scoring Example';
run;
proc print data=RegOut;
title2 'OUTEST= Data Set from PROC REG';
run;
proc score data=Fitness score=RegOut out=RScoreP type=parms;
var Age Weight RunTime RunPulse RestPulse;
run;
7360 F Chapter 86: The SCORE Procedure
proc print data=RScoreP;
title2 'Predicted Scores for Regression';
run;
proc score data=Fitness score=RegOut out=RScoreR type=parms;
var Oxygen Age Weight RunTime RunPulse RestPulse;
run;
proc print data=RScoreR;
title2 'Negative Residual Scores for Regression';
run;
Output 86.2.1 shows the PROC REG output. The column labeled “Parameter Estimates” lists the parameter
estimates. These estimates are output to the RegOut data set.
Output 86.2.1 Creating an OUTEST= Data Set with PROC REG
Regression Scoring Example
The REG Procedure
Model: OxyHat
Dependent Variable: Oxygen
Number of Observations Read
Number of Observations Used
12
12
Analysis of Variance
Source
DF
Sum of
Squares
Mean
Square
Model
Error
Corrected Total
5
6
11
509.62201
38.70060
548.32261
101.92440
6.45010
Root MSE
Dependent Mean
Coeff Var
2.53970
48.38942
5.24847
R-Square
Adj R-Sq
F Value
Pr > F
15.80
0.0021
0.9294
0.8706
Parameter Estimates
Variable
Intercept
Age
Weight
RunTime
RunPulse
RestPulse
DF
Parameter
Estimate
Standard
Error
t Value
Pr > |t|
1
1
1
1
1
1
151.91550
-0.63045
-0.10586
-1.75698
-0.22891
-0.17910
31.04738
0.42503
0.11869
0.93844
0.12169
0.13005
4.89
-1.48
-0.89
-1.87
-1.88
-1.38
0.0027
0.1885
0.4068
0.1103
0.1090
0.2176
Example 86.2: Regression Parameter Estimates F 7361
Output 86.2.2 lists the RegOut data set. Note that _TYPE_=’PARMS’ and _MODEL_=’OXYHAT’, which
are from the label in the MODEL statement in PROC REG.
Output 86.2.2 OUTEST= Data Set from PROC REG Reproduced with PROC PRINT
Regression Scoring Example
OUTEST= Data Set from PROC REG
Obs
_MODEL_
_TYPE_
_DEPVAR_
_RMSE_
Intercept
Age
1
OxyHat
PARMS
Oxygen
2.53970
151.916
-0.63045
Obs
Weight
RunTime
RunPulse
Rest
Pulse
Oxygen
1
-0.10586
-1.75698
-0.22891
-0.17910
-1
Output 86.2.3 lists the data sets created by PROC SCORE. Since the SCORE= data set does not contain
observations with _TYPE_=’MEAN’ or _TYPE_=’STD’, the data in the Fitness data set are not standardized
before scoring. The SCORE= data set contains the variable Intercept, so this intercept value is used in
computing the score. To produce the RScoreP data set, the VAR statement in PROC SCORE includes only
the independent variables from the model in PROC REG. As a result, the OxyHat variable contains predicted
values. To produce the RScoreR data set, the VAR statement in PROC SCORE includes both the dependent
variables and the independent variables from the model in PROC REG. As a result, the OxyHat variable
contains negative residuals (PREDICT–ACTUAL) as shown in Output 86.2.4. If the RESIDUAL option is
specified, the variable OxyHat contains positive residuals (ACTUAL–PREDICT). If the PREDICT option is
specified, the OxyHat variable contains predicted values.
Output 86.2.3 Predicted Scores from the OUT= Data Set Created by PROC SCORE
Regression Scoring Example
Predicted Scores for Regression
Obs
Age
Weight
Oxygen
Run
Time
Rest
Pulse
Run
Pulse
OxyHat
1
2
3
4
5
6
7
8
9
10
11
12
44
40
44
42
38
47
40
43
44
38
44
45
89.47
75.07
85.84
68.15
89.02
77.45
75.98
81.19
81.42
81.87
73.03
87.66
44.609
45.313
54.297
59.571
49.874
44.811
45.681
49.091
39.442
60.055
50.541
37.388
11.37
10.07
8.65
8.17
9.22
11.63
11.95
10.85
13.08
8.63
10.13
14.03
62
62
45
40
55
58
70
64
63
48
45
56
178
185
156
166
178
176
176
162
174
170
168
186
42.8771
47.6050
56.1211
58.7044
51.7386
42.9756
44.8329
48.6020
41.4613
56.6171
52.1299
37.0080
7362 F Chapter 86: The SCORE Procedure
Output 86.2.4 Residual Scores from the OUT= Data Set Created by PROC SCORE
Regression Scoring Example
Negative Residual Scores for Regression
Obs
Age
Weight
Oxygen
Run
Time
1
2
3
4
5
6
7
8
9
10
11
12
44
40
44
42
38
47
40
43
44
38
44
45
89.47
75.07
85.84
68.15
89.02
77.45
75.98
81.19
81.42
81.87
73.03
87.66
44.609
45.313
54.297
59.571
49.874
44.811
45.681
49.091
39.442
60.055
50.541
37.388
11.37
10.07
8.65
8.17
9.22
11.63
11.95
10.85
13.08
8.63
10.13
14.03
Rest
Pulse
Run
Pulse
OxyHat
62
62
45
40
55
58
70
64
63
48
45
56
178
185
156
166
178
176
176
162
174
170
168
186
-1.73195
2.29197
1.82407
-0.86657
1.86460
-1.83542
-0.84811
-0.48897
2.01935
-3.43787
1.58892
-0.38002
The second part of this example uses the parameter estimates to score a new data set. The following statements
produce Output 86.2.5 and Output 86.2.6:
/* The FITNESS2 data set contains observations 13-16 from */
/* the FITNESS data set used in EXAMPLE 2 in the PROC REG */
/* chapter.
*/
data Fitness2;
input Age Weight Oxygen RunTime RestPulse RunPulse;
datalines;
45 66.45 44.754 11.12 51 176
47 79.15 47.273 10.60 47 162
54 83.12 51.855 10.33 50 166
49 81.42 49.156
8.95 44 180
;
proc print data=Fitness2;
title 'Regression Scoring Example';
title2 'New Raw Data Set to be Scored';
run;
proc score data=Fitness2 score=RegOut out=NewPred type=parms
nostd predict;
var Oxygen Age Weight RunTime RunPulse RestPulse;
run;
proc print data=NewPred;
title2 'Predicted Scores for Regression';
title3 'for Additional Data from FITNESS2';
run;
Example 86.2: Regression Parameter Estimates F 7363
Output 86.2.5 lists the Fitness2 data set.
Output 86.2.5 Listing of the Fitness2 Data Set
Regression Scoring Example
New Raw Data Set to be Scored
Obs
1
2
3
4
Age
Weight
Oxygen
Run
Time
Rest
Pulse
Run
Pulse
45
47
54
49
66.45
79.15
83.12
81.42
44.754
47.273
51.855
49.156
11.12
10.60
10.33
8.95
51
47
50
44
176
162
166
180
PROC SCORE scores the Fitness2 data set by using the parameter estimates in the RegOut data set. These
parameter estimates result from fitting a regression equation to the Fitness data set. The NOSTD option is
specified, so the raw data are not standardized before scoring. (However, the NOSTD option is not necessary
here. The SCORE= data set does not contain observations with _TYPE_=’MEAN’ or _TYPE_=’STD’, so
standardization is not performed.) The VAR statement contains the dependent variables and the independent
variables used in PROC REG. In addition, the PREDICT option is specified. This combination gives predicted
values for the new score variable. The name of the new score variable is OxyHat, from the value of the
_MODEL_ variable in the SCORE= data set. Output 86.2.6 shows the data set produced by PROC SCORE.
Output 86.2.6 Predicted Scores from the OUT= Data Set Created by PROC SCORE and Reproduced
Using PROC PRINT
Regression Scoring Example
Predicted Scores for Regression
for Additional Data from FITNESS2
Obs
1
2
3
4
Age
Weight
Oxygen
Run
Time
Rest
Pulse
Run
Pulse
OxyHat
45
47
54
49
66.45
79.15
83.12
81.42
44.754
47.273
51.855
49.156
11.12
10.60
10.33
8.95
51
47
50
44
176
162
166
180
47.5507
49.7802
43.9682
47.5949
7364 F Chapter 86: The SCORE Procedure
Example 86.3: Custom Scoring Coefficients
This example uses a specially created custom scoring data set and produces Output 86.3.1 and Output 86.3.2.
The first scoring coefficient creates a variable that is Age–Weight; the second scoring coefficient evaluates
the variable RunPulse–RstPulse; and the third scoring coefficient totals all six variables. Since the scoring
coefficients data set (data set A) does not contain any observations with _TYPE_=’MEAN’ or _TYPE_=’STD’,
the data in the Fitness data set (see Example 86.1) are not standardized before scoring.
The following statements produce Output 86.3.1 and Output 86.3.2:
data A;
input _type_ $ _name_ $
Age Weight RunTime RunPulse RestPulse;
datalines;
SCORE AGE_WGT 1 -1 0 0 0
SCORE RUN_RST 0 0 0 1 -1
SCORE TOTAL
1 1 1 1 1
;
proc print data=A;
title 'Constructed Scoring Example';
title2 'Scoring Coefficients';
run;
proc score data=Fitness score=A out=B;
var Age Weight RunTime RunPulse RestPulse;
run;
proc print data=B;
title2 'Scored Data';
run;
Output 86.3.1 Custom Scoring Data Set and Scored Fitness Data: PROC PRINT
Constructed Scoring Example
Scoring Coefficients
Obs
_type_
_name_
1
2
3
SCORE
SCORE
SCORE
AGE_WGT
RUN_RST
TOTAL
Age
Weight
Run
Time
Run
Pulse
Rest
Pulse
1
0
1
-1
0
1
0
0
1
0
1
1
0
-1
1
References F 7365
Output 86.3.2 Custom Scored Fitness Data: PROC PRINT
Constructed Scoring Example
Scored Data
Obs
Age
Weight
Oxygen
Run
Time
1
2
3
4
5
6
7
8
9
10
11
12
44
40
44
42
38
47
40
43
44
38
44
45
89.47
75.07
85.84
68.15
89.02
77.45
75.98
81.19
81.42
81.87
73.03
87.66
44.609
45.313
54.297
59.571
49.874
44.811
45.681
49.091
39.442
60.055
50.541
37.388
11.37
10.07
8.65
8.17
9.22
11.63
11.95
10.85
13.08
8.63
10.13
14.03
Rest
Pulse
Run
Pulse
AGE_WGT
RUN_RST
62
62
45
40
55
58
70
64
63
48
45
56
178
185
156
166
178
176
176
162
174
170
168
186
-45.47
-35.07
-41.84
-26.15
-51.02
-30.45
-35.98
-38.19
-37.42
-43.87
-29.03
-42.66
116
123
111
126
123
118
106
98
111
122
123
130
TOTAL
384.84
372.14
339.49
324.32
369.24
370.08
373.93
361.04
375.50
346.50
340.16
388.69
References
Chaseling, J. (1996), “Standard Test Results of Students at Three Types of Schools,” Sample data, Faculty of
Environmental Sciences, Griffith University, Queensland, Australia.
Subject Index
analyzing data in groups
SCORE procedure, 7351
custom scoring coefficients, example
SCORE procedure, 7364
DOT product (SCORE), 7343
factor scoring coefficients
FACTOR procedure, 7344
SCORE procedure, 7344, 7354
matrix
multiplication (SCORE), 7343
missing values
SCORE procedure, 7353
OUT= data sets
SCORE procedure, 7353
output data set
SCORE procedure, 7350, 7353
regression coefficients
using with SCORE procedure, 7344
regression parameter estimates, example
SCORE procedure, 7359
SCORE procedure
computational resources, 7354
examples, 7345, 7354
input data set, 7350
OUT= data sets, 7353
output data set, 7350, 7353
regression parameter estimates from REG
procedure, 7353
scoring coefficients, 7344
score variables
interpretation (SCORE), 7353
scoring coefficients (SCORE), 7343
standardizing
raw data (SCORE), 7344
Syntax Index
BY statement
SCORE procedure, 7351
DATA= option
PROC SCORE statement, 7350
NOSTD option
PROC SCORE statement, 7350
OUT= option
PROC SCORE statement, 7350
PREDICT option
PROC SCORE statement, 7351
PROC SCORE statement, see SCORE procedure
RESIDUAL option
PROC SCORE statement, 7351
SCORE procedure
syntax, 7350
SCORE procedure, BY statement, 7351
SCORE procedure, ID statement, 7352
SCORE procedure, PROC SCORE statement, 7350
DATA= option, 7350
NOSTD option, 7350
OUT= option, 7350
PREDICT option, 7351
RESIDUAL option, 7351
SCORE= option, 7351
TYPE= option, 7351
SCORE procedure, VAR statement, 7352
SCORE= option
PROC SCORE statement, 7351
TYPE= option
PROC SCORE statement, 7351

advertisement

Key Features

  • Multiplies values from two SAS data sets
  • Computes linear combinations of coefficients and raw data values
  • Standardizes or centers the raw data
  • Outputs a new SAS data set with scores
  • Handles missing values in both data sets
  • Supports regression parameter estimates from PROC REG
  • Provides flexible handling of scoring coefficients
  • Offers control over data standardization and score calculation
  • Creates a new SAS data set containing linear combinations of variables

Frequently Answers and Questions

What is the purpose of the SCORE procedure?
The SCORE procedure multiplies values from two SAS data sets, one containing coefficients (such as factor-scoring coefficients or regression coefficients) and the other containing raw data, to produce a new data set with scores.
How does the SCORE procedure handle missing values?
If a scoring variable in the input data set has a missing value, all scores for that observation will have missing values. However, if the PREDICT option is specified, a variable with a coefficient of -1 or 0 can tolerate a missing value.
Can I use the SCORE procedure with regression parameter estimates from PROC REG?
Yes, you can use the SCORE procedure with regression parameter estimates from PROC REG by specifying the TYPE=PARMS option. The interpretation of the new score variables depends on the options chosen and the variables listed in the VAR statement.

Related manuals

Download PDF

advertisement