As already mentioned, a different process must be used to calculate disability scores for children. Children at different ages are incredibly different, and thus have different items pertaining to them in the questionnaire.
In the Capacity section of the children questionnaire, items are applied to all children aged 2 to 17, without age-specific filters. In this case, a multigroup analysis must be performed. This is essentially the same as performing a Rasch Model as for adults, however we alert the program to the fact that we have very different groups responding to the questionnaire–that is, different age groups. We use the age groups 2 to 4, 5 to 9, and 10 to 17.
In the Performance section of the children questionnaire, there is a set of common questions for all children and a set of questions with age-specific filters. In this case, a multigroup analysis is first performed on all the common items, as with the Capacity questions. Then this multigroup model is used to calibrate an additional score using all of the age-specific questions. This is called “anchoring”–we are anchoring the results of the age-specific questions on the results from the common set of questions. This must be done in order to make the disability scores obtained for children comparable across age groups.
The whomds
package also contains functions to perform
Rasch Anaylsis for children.
To perform Rasch Analysis for children, only one function needs to be
used: rasch_mds_children()
. This is a wrapper function,
like rasch_mds()
for adults.
rasch_mds_children()
utilizes all the other
rasch_*()
functions to perform the analysis. These other
rasch_*()
functions used for children are:
rasch_df_nest()
rasch_drop()
rasch_factor()
rasch_model_children()
rasch_quality_children()
rasch_quality_children_print()
rasch_recode()
rasch_rescale_children()
rasch_split()
rasch_split_age()
rasch_testlet()
You only need to use the function rasch_mds_children()
.
But if you wanted to perform your analysis in a more customized way, you
could work with the internal functions directly.
After each iteration of the Rasch Model, the codes above produce a variety of files that will tell you how well the data fit the assumptions of the model at this iteration. The most important files are the ones below. We will explain each in detail below.
LID_plot_Multigroup.pdf
and
LID_above_0.2_Multigroup.csv
- shows correlated items for
multigroup modelLID_plotAge2to4.pdf
and
LID_above_0.2_Age2to4.csv
- shows correlated items for ages
2 to 4 for anchored modelLID_plotAge5to9.pdf
and
LID_above_0.2_Age5to9.csv
- shows correlated items for ages
5 to 9 for anchored modelLID_plotAge10to17.pdf
and
LID_above_0.2_Age10to17.csv
- shows correlated items for
ages 10 to 17 for anchored modelScreePlot_Multigroup_PCM2.pdf
- scree plot for
multigroup modelScreePlot_Anchored_PCM2_Age2to4.pdf
- scree plot for
ages 2 to 4 for anchored modelScreePlot_Anchored_PCM2_Age5to9.pdf
- scree plot for
ages 5 to 9 for anchored modelScreePlot_Anchored_PCM2_Age10to17.pdf
- scree plot for
ages 10 to 17 for anchored modelWrightMap_multigroup.pdf
- locations of persons and
items for multigroup modelWrightMap_anchored_Age2to4.pdf
- locations of persons
and items for ages 2 to 4 for anchored modelWrightMap_anchored_Age5to9.pdf
- locations of persons
and items for ages 5 to 9 for anchored modelWrightMap_anchored_Age10to17.pdf
- locations of persons
and items for 10 to 17 for anchored modelPCM2_itemfit_multigroup.xlsx
- item fit for multigroup
modelPCM2_itemfit_anchored.xlsx
- item fit for anchored
modelPCM2_WLE_multigroup.txt
- reliability of model for
multigroup modelPCM2_WLE_anchored.txt
- reliability of model for
anchored modelrasch_mds_children()
In this section we will examine each argument of the function
rasch_mds_children()
. The help file for the function also
describes the arguments. Help files for functions in R
can
be accessed by writing ?
before the function name, like
so:
Below is an example of a use of the function
rasch_mds_children()
:
rasch_mds_children(df = df_children,
vars_id = "HHID",
vars_group = "age_cat",
vars_metric_common = paste0("child", c(1:10)),
vars_metric_grouped = list(
"Age2to4" = paste0("child", c(12,15,16,19,20,24,25)),
"Age5to9" = paste0("child", c(11,13,14,17,18,20,21,22,23,24,25,27)),
"Age10to17" = paste0("child", c(11,13,14,17,18,20,21,22,23,25,26,27))),
TAM_model = "PCM2",
resp_opts = 1:5,
has_at_least_one = 4:5,
max_NA = 10,
print_results = TRUE,
path_parent = "~/Desktop/",
model_name = "Start",
testlet_strategy = NULL,
recode_strategy = NULL,
drop_vars = NULL,
split_strategy = NULL,
comment = "Initial run")
The first argument is df
. This argument is for the data.
The data object should be individual survey data, with one row per
individual. Here the data is stored in an object
df_children
, which is a data set included in the
whomds
package. To find out more about
df_children
please look at its help file by running:
?df_children
The next argument is vars_id
, which is the name of the
column used to uniquely identify individuals. Here the ID column is
called HHID
.
The next argument is vars_group
, which is a string with
the column name identifying groups, specifically age groups in this
case. Here the age group column is age_cat
.
The next two arguments specify the variables used to build the metric.
vars_metric_common
is a character vector with the names
of the items that every individual in the same answers.
vars_metric_grouped
is a named list of character vectors
with the items to use in the Rasch Analysis per group from the variable
specified in vars_group
. The list should have names
corresponding to the different groups (for example,
Age2to4
, Age5to9
and Age10to17
),
and contain character vectors of the corrsponding items for each
group.
If only vars_metric_common
is specified, then a
multigroup analysis will be performed. If
vars_metric_common
and vars_metric_grouped
are
both specified, then an anchored analysis will be performed.
vars_metric_grouped
cannot be specified on its own.
The next argument is TAM_model
. This is a string that
identifies which Item Response Theory model to use. This value is passed
to the function from the TAM
package used to calculate the
Rasch Model. The default is "PCM2"
, which specifies the
Rasch Model.
The next argument is resp_opts
. This is a numeric vector
with the possible response options for vars_metric_common
and vars_metric_grouped
. In this survey, the questions have
response options 1 to 5. So here resp_opts
is equal to a
numeric vector of length 5 with the values 1
,
2
, 3
, 4
and 5
.
The next argument is has_at_least_one
. This is a numeric
vector with the response options that a respondent must have at least
one of in order to be included in the metric calculation. This argument
is required because often Rasch Analysis of children data is more
difficult because of the extreme skewness of the responses. For this
reason, it is often advisable to build a scale only with the respondents
on the more severe end of the disability continuum, for example only
with those respondents who answered 4 or 5 to at least one question. By
specifying has_at_least_one
, the function will only build a
scale among those who answered has_at_least_one
in at least
one question from vars_metric_common
and
vars_metric_grouped
. The default is 4:5
, which
means the scale will be created only with children who answered 4 or 5
to at least one question in vars_metric_common
and
vars_metric_grouped
. The scores created can be reunited
with the excluded children post-hoc.
The next argument is max_NA
. This is a numeric value for
the maximum number of missing values allowed for an individual to be
still be considered in the analysis. Rasch Analysis can handle
individuals having a few missing values, but too many will cause
problems in the analysis. In general all individuals in the sample
should have fewer than 15% missing values. Here max_NA
is
set to 10
, meaning individuals are allowed to have a
maximum of ten missing values out of the questions in
vars_metric_common
and vars_metric_grouped
to
still be included in the analysis.
The next argument is print_results
, which is either
TRUE
or FALSE
. When it is TRUE
,
files will be saved onto your computer with results from the Rasch
iteration. When it is FALSE
, files will not be saved.
The next argument is path_parent
. This is a string with
the path to the folder where the results of multiple models will be
saved, assuming print_results
is TRUE
. The
folder in path_parent
will then contain separate folders
with the names specified in model_name
at each iteration.
In the function call above, all results will be saved on the Desktop.
Note that when writing paths for R
, the slashes should all
be: /
(NOT \
). Be sure to include a final
/
on the end of the path.
The next argument is model_name
. This is equal to a
string where you give a name of the model you are running. This name
will be used as the name of the folder where all the output will be
saved on your computer, if print_results
is
TRUE
. The name you give should be short but informative.
For example, you may call your first run “Start”, as it is called here,
if you create a testlet in your second run perhaps you can call it
“Testlet1”, etc. Choose whatever will be meaningful to you.
The next arguments are testlet_strategy
,
recode_strategy
, drop_vars
and
split_strategy
. These are arguments that control how the
data is used in each iteration of the Rasch Model. These are specified
in the same way as for adult models, as described in the above sections.
Here they are all set to NULL
, which means they are not
used in the iteration of Rasch Analysis shown here.
The last argument is comment
. This is equal to a string
where you can write some free-text information about the current
iteration so that when you are looking at the results later you can
remember what you did and why you did it. It is better to be more
detailed, because you will forget why you chose to run the model in this
particular way. This comment will be saved in a file
Comment.txt
. Here the comment is just
"Initial run"
.
We have already discussed differences in the command required to
compute the Rasch Model for children. To run this model, a different
R
package is available: TAM
. This package can
be more complicated to use than eRm
(the package used for
adults), but it allows for greater flexibility, and thus allows us to
compute the multigroup and anchored analyses that we require.
With the TAM
package, we are also able to compute
different types of models. By default the whomds
package
computes results using the Partial Credit Model (the normal polytomous
Rasch Model). The PCM is specified in TAM
with the argument
TAM_model
set to "PCM2"
.
Another option for TAM_model
is "GPCM"
,
which tells the program to calculate a Generalized Partial Credit Model
(GPCM). The GPCM differs from the PCM in that relaxes the uniform
discriminating power of items. In other words, the GPCM allows for some
items to be better able to identify people’s abilities than other
items.
We focus on PCM because we can obtain the scale we seek with this model.
As mentioned previously, generally the distribution of children’s
answers is much more heavily skewed towards the “no disability” end of
the scale, so it is harder to build a quality metric. Your model
will be greatly improved if you only build it among children who
indicate having some sort of level of problems in at least one of the
domains. This is why the default for the argument
has_at_least_one
of rasch_children()
is
4:5
. We recommend keeping this default.
Collapsing response options may also improve the quality of your scale. We also recommend trying to build a scale where all items are recoded to 0,1,1,2,2–in other words, collapse the 2nd and 3rd response options and the 3rd and the 4th response options.
The output from the iterations of the Rasch Model for children using the WHO codes is very similar to that of the adult models. However, for the anchored models (for Performance) in particular, for most types of files there is a separate file for each age group. The quality of the model may vary across age groups, and all the information must be assessed simultaneously in order to determine how to adjust the data.
There are a couple of key differences we will highlight. In order to
assess the model fit (targeting) of the model, open the file
PCM2_WLE.txt
and analyze the number after
"WLE Reliability = "
. WLE stands for “weighted likelihood
estimator”, and it corresponds to the PSI we used to assess model fit
for the adult models. Similarly, for good model fit, we would like to
see this number be at least greater than 0.7, ideally greater than 0.8
or more.
The person-item map is also different for the children model. The
TAM
package works with another package called
WrightMap
to produce, aptly called, Wright Maps. These are
very similar to the person item maps we used previously, except the type
of thresholds displayed are different. The Wright Map displays the
“Thurstonian Thresholds”, and these thresholds by definition will never
be disordered. Hence is reasonable to generally ignore recoding response
options for the children, unless you would like to collapse empty
response options or attempt to improve the spread of the threshold
locations to better target the persons’ abilities.
As with adults, the new rescaled scores for children are located in
the file Data_final.csv
.
Go to the Best Practices in Rasch Analysis vignette to read more about general rules to follow.