Title: | Models of Decision Confidence and Measures of Metacognition |
---|---|
Description: | Provides fitting functions and other tools for decision confidence and metacognition researchers, including meta-d'/d', often considered to be the gold standard for measuring metacognitive efficiency, and information-theoretic measures of metacognition. It also allows fitting several static models of decision making and confidence. |
Authors: | Manuel Rausch [aut, cre] |
Maintainer: | Manuel Rausch <[email protected]> |
License: | GPL(>=3) |
Version: | 0.2.1 |
Built: | 2025-03-04 10:27:45 UTC |
Source: | https://github.com/manuelrausch/statconfr |
estimateMetaI estimates meta-$I$, an information-theoretic
measure of metacognitive sensitivity proposed by Dayan (2023), as well as
similar derived measures, including meta-$I_1^r$ and meta-$I_2^r$.
These are different normalizations of meta-$I$:
Meta-$I_1^r$ normalizes by the meta-$I$ that would be
expected from an underlying normal distribution with the same
sensitivity.
Meta-$I_1^{r\prime}$ is a variant of meta-$I_1^r$ not discussed by Dayan
(2023) which normalizes by the meta-$I$ that would be expected from
an underlying normal distribution with the same accuracy (this is
similar to the sensitivity approach but without considering variable
thresholds).
Meta-$I_2^r$ normalizes by the maximum amount of meta-$I$
which would be reached if all uncertainty about the stimulus was removed.
RMI normalizes meta-$I$ by the range of its possible
values and therefore scales between 0 and 1. RMI is a novel measure not discussed by Dayan (2023).
All measures can be calculated with a bias-reduced variant, for which the observed frequencies are taken as the underlying probability distribution to estimate the sampling bias. The estimated bias is then subtracted from the initial measures. This approach uses Monte Carlo simulations and is therefore not deterministic (values can vary from one evaluation of the function to the next). However, it is a simple way to reduce the bias inherent in these measures.
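The bias-reduction idea can be illustrated in base R. The following is a minimal sketch under stated assumptions, not the package's implementation: the helper names `mi_bits` and `bias_reduced_mi`, the number of simulations, and the made-up accuracy-by-confidence table are all hypothetical.

```r
# Plug-in mutual information (in bits) of a two-way table of counts.
mi_bits <- function(counts) {
  p  <- counts / sum(counts)
  pr <- rowSums(p)
  pc <- colSums(p)
  expected <- outer(pr, pc)
  nz <- p > 0                      # cells with zero probability contribute 0
  sum(p[nz] * log2(p[nz] / expected[nz]))
}

# Hypothetical bias reduction: treat the observed relative frequencies as the
# true distribution, simulate new tables of the same size, and subtract the
# average overestimation from the plug-in estimate.
bias_reduced_mi <- function(counts, n_sim = 500) {
  n      <- sum(counts)
  p_obs  <- as.vector(counts) / n
  naive  <- mi_bits(counts)
  sim_mi <- replicate(n_sim, {
    sim <- matrix(rmultinom(1, size = n, prob = p_obs), nrow = nrow(counts))
    mi_bits(sim)
  })
  bias <- mean(sim_mi) - naive
  c(naive = naive, bias = bias, corrected = naive - bias)
}

set.seed(1)
# Rows: incorrect / correct; columns: three confidence levels (made-up counts)
counts <- matrix(c(10, 10, 10, 10, 10, 10), nrow = 2)
result <- bias_reduced_mi(counts)
print(result)
```

Because the simulated tables are random, repeated calls without a fixed seed give slightly different corrected values, which mirrors the non-determinism described above.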
estimateMetaI(data, bias_reduction = TRUE)
data |
a
|
bias_reduction |
|
It is assumed that a classifier (possibly a human being performing a discrimination task
or an algorithmic classifier in a classification application)
makes a binary prediction $R$ about a true state of the world $S$
and gives a confidence rating $C$.
Meta-$I$ is defined as the mutual information between the confidence and
accuracy and is calculated as the transmitted information minus the
minimal information given the accuracy,
$$\text{meta-}I = I(S; R, C) - I(S; R).$$
This is equivalent to Dayan's formulation where meta-$I$ is the information that confidence transmits about the correctness of a response,
$$\text{meta-}I = I(S = R; C).$$
Meta-$I$ is expressed in bits, i.e. the log base is 2.
The other measures are different normalizations of meta-$I$
and are unitless.
It should be noted that Dayan (2023) pointed out that a liberal or
conservative use of the confidence levels will affect the mutual
information and thus influence meta-$I$.
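To make the definition concrete, the mutual information between accuracy and confidence can be computed from trial-level data as the entropy of accuracy minus the entropy of accuracy given confidence. The sketch below is a plain plug-in estimate in base R without any bias reduction; the function names `entropy_bits` and `meta_i_plugin` and the example vectors are assumptions made for illustration and are not part of the package.

```r
# Shannon entropy (in bits) of a vector of probabilities.
entropy_bits <- function(p) {
  p <- p[p > 0]
  -sum(p * log2(p))
}

# Mutual information between accuracy and confidence, I(accuracy; confidence),
# computed as H(accuracy) - H(accuracy | confidence) from trial-level data.
meta_i_plugin <- function(correct, rating) {
  joint  <- table(correct, rating) / length(correct)
  p_acc  <- rowSums(joint)
  p_conf <- colSums(joint)
  h_acc  <- entropy_bits(p_acc)
  # Conditional entropy: weighted entropy of accuracy within each rating level
  h_acc_given_conf <- sum(sapply(seq_along(p_conf), function(j) {
    if (p_conf[j] == 0) return(0)
    p_conf[j] * entropy_bits(joint[, j] / p_conf[j])
  }))
  h_acc - h_acc_given_conf
}

# Made-up example: confidence perfectly separates correct from incorrect trials
correct <- c(1, 1, 1, 1, 0, 0, 0, 0)
rating  <- c(2, 2, 2, 2, 1, 1, 1, 1)
perfect <- meta_i_plugin(correct, rating)      # 1 bit

# Confidence unrelated to accuracy carries no information
unrelated <- meta_i_plugin(c(1, 0, 1, 0), c(1, 1, 2, 2))   # 0 bits

print(c(perfect = perfect, unrelated = unrelated))
```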
a data.frame with one row for each subject and the following columns:
participant is the participant ID,
meta_I is the estimated meta-$I$ value (expressed in bits, i.e. log base is 2),
meta_Ir1 is meta-$I_1^r$,
meta_Ir1_acc is meta-$I_1^{r\prime}$,
meta_Ir2 is meta-$I_2^r$, and
RMI is RMI.
Sascha Meyen, [email protected]
Dayan, P. (2023). Metacognitive Information Theory. Open Mind, 7, 392–411. doi:10.1162/opmi_a_00091
# 1. Select two subjects from the masked orientation discrimination experiment
data <- subset(MaskOri, participant %in% c(1:2))
head(data)

# 2. Calculate meta-I measures with bias reduction (this may take 10 s per subject)
metaIMeasures <- estimateMetaI(data)

# 3. Calculate meta-I measures for all participants without bias reduction (much faster)
metaIMeasures <- estimateMetaI(MaskOri, bias_reduction = FALSE)
metaIMeasures
The fitConf
function fits the parameters of one static model of decision confidence,
provided by the model
argument, to binary choices and confidence judgments.
See Details for the mathematical specification of the implemented models and
their parameters.
Parameters are fitted using a maximum likelihood estimation method with an
initial grid search to find promising starting values for the optimization.
In addition, several measures of model fit (negative log-likelihood, BIC, AIC, and AICc)
are computed, which can be used for a quantitative model evaluation.
fitConf(data, model = "SDT", nInits = 5, nRestart = 4)
data |
a
|
model |
|
nInits |
|
nRestart |
|
The fitting routine first performs a coarse grid search to find promising
starting values for the maximum likelihood optimization procedure. Then the best nInits
parameter sets found by the grid search are used as the initial values for separate
runs of the Nelder-Mead algorithm implemented in optim
.
Each run is restarted nRestart
times.
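The general pattern of a coarse grid search followed by restarted Nelder-Mead runs can be sketched with base R's `optim`. This is only an illustration on a toy normal likelihood: the data, the grid, and the interpretation of a restart as re-running the optimizer from its previous solution are assumptions, and the code does not reproduce the package's internal routine.

```r
set.seed(42)
x <- rnorm(200, mean = 1.5, sd = 2)   # toy data, unrelated to the package

# Negative log-likelihood of a normal model; the SD is optimised on the log
# scale so that it stays positive.
neg_log_lik <- function(par, data) {
  -sum(dnorm(data, mean = par[1], sd = exp(par[2]), log = TRUE))
}

# 1. Coarse grid search over candidate starting values
grid <- expand.grid(mu = seq(-4, 4, by = 1), log_sd = seq(-1, 2, by = 0.5))
grid$nll <- apply(grid, 1, function(p) neg_log_lik(p, x))

# 2. Keep the nInits best grid points as starting values
nInits   <- 5
nRestart <- 4
starts <- grid[order(grid$nll)[seq_len(nInits)], c("mu", "log_sd")]

# 3. Run Nelder-Mead from each start and restart it from its own solution
fits <- lapply(seq_len(nInits), function(i) {
  fit <- optim(unlist(starts[i, ]), neg_log_lik, data = x, method = "Nelder-Mead")
  for (r in seq_len(nRestart)) {
    fit <- optim(fit$par, neg_log_lik, data = x, method = "Nelder-Mead")
  }
  fit
})

# 4. Select the run with the smallest negative log-likelihood
best <- fits[[which.min(sapply(fits, function(f) f$value))]]
estimates <- c(mu = unname(best$par[1]), sd = unname(exp(best$par[2])))
print(estimates)
```

Starting from several grid points reduces the risk that a single Nelder-Mead run ends in a poor local optimum, at the cost of a proportionally longer run time.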
The computational models are all based on signal detection theory (Green & Swets, 1966). It is assumed
that participants select a binary discrimination response $R$ about a stimulus $S$.
Both $S$ and $R$ can be either -1 or 1.
$R$ is considered correct if $S = R$.
In addition, we assume that there are $K$ different levels of stimulus discriminability
in the experiment, i.e. a physical variable that makes the discrimination task easier or harder.
For each level of discriminability, the function fits a different discrimination
sensitivity parameter $d_k$. If there is more than one sensitivity parameter,
we assume that the sensitivity parameters are ordered such that $0 < d_1 < d_2 < \dots < d_K$.
The models assume that the stimulus generates normally distributed sensory evidence $x$
with mean $S \times d_k/2$ and variance of 1. The sensory evidence $x$ is compared to a decision
criterion $c$ to generate a discrimination response $R$, which is 1 if $x$ exceeds $c$ and -1 otherwise.
To generate confidence, it is assumed that the confidence variable $y$ is compared to another
set of criteria $\theta_{R,i}$, $i = 1, 2, \dots, L-1$, depending on the
discrimination response $R$, to produce an $L$-step discrete confidence response.
The number of thresholds will be inferred from the number of steps in the
rating column of data. Thus, the parameters shared between all models are:
sensitivity parameters $d_1, \dots, d_K$ ($K$: number of difficulty levels)
decision criterion $c$
confidence criteria $\theta_{-1,1}, \theta_{-1,2}, \dots, \theta_{-1,L-1}, \theta_{1,1}, \theta_{1,2}, \dots, \theta_{1,L-1}$ ($L$: number of confidence categories available for confidence ratings)
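The generative process shared by the models can be simulated in a few lines of base R. The sketch below assumes a single difficulty level, a stimulus coded as -1 or 1, evidence with mean stimulus times sensitivity divided by 2 and unit variance, and a confidence variable equal to the sensory evidence (the simplest case); all parameter values and variable names are made up for illustration.

```r
set.seed(1)
n     <- 20000
d     <- 1.5                 # sensitivity for a single difficulty level
c_dec <- 0                   # decision criterion
# Confidence criteria for a 3-step rating scale (L = 3, so L - 1 = 2 per response)
theta_minus <- c(-1.5, -0.7) # criteria applied when the response is -1
theta_plus  <- c(0.7, 1.5)   # criteria applied when the response is 1

S <- sample(c(-1, 1), n, replace = TRUE)      # stimulus
x <- rnorm(n, mean = S * d / 2, sd = 1)       # sensory evidence
R <- ifelse(x > c_dec, 1, -1)                 # discrimination response
y <- x                                        # confidence variable (simplest case)

# Rating = 1 + number of criteria passed in the direction of the response
rating <- ifelse(
  R == 1,
  1 + rowSums(outer(y, theta_plus, ">")),
  1 + rowSums(outer(y, theta_minus, "<"))
)

accuracy <- mean(R == S)
print(accuracy)              # should be close to pnorm(d / 2)
print(table(R, rating))
```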
How the confidence variable is computed varies across the different models.
The following models have been implemented so far:
According to SDT, the same sample of sensory
evidence is used to generate response and confidence, i.e., $y = x$,
and the confidence criteria span from the left and
right side of the decision criterion $c$ (Green & Swets, 1966).
According to the Gaussian noise model (GN), $y$ is subject to
additive noise and assumed to be normally distributed around the decision
evidence value $x$ with a standard deviation $\sigma$ (Maniscalco & Lau, 2016).
The parameter $\sigma$ is a free parameter.
WEV assumes that the observer combines evidence about decision-relevant features
of the stimulus with the strength of evidence about choice-irrelevant features
to generate confidence (Rausch et al., 2018). Here, we use the version of the WEV model
used by Rausch et al. (2023), which assumes that $y$ is normally
distributed with a mean of $(1-w) \times x + w \times d_k \times R$ and standard deviation $\sigma$.
The parameter $\sigma$ quantifies the amount of unsystematic variability
contributing to confidence judgments but not to the discrimination judgments.
The parameter $w$ represents the weight that is put on the choice-irrelevant
features in the confidence judgment.
$w$ and $\sigma$ are fitted in
addition to the set of shared parameters.
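How a weighted combination of decision evidence and stimulus strength shapes the confidence variable can be sketched as follows. The helper `sample_wev_confidence` is hypothetical, and the mean used here, a weight-(1 - w) share of the evidence plus a weight-w share of sensitivity times response, is stated as an assumption of this example rather than as the package's code.

```r
# Hypothetical helper: draw the confidence variable y for given evidence x,
# response R, sensitivity d_k, weight w, and noise SD sigma.
sample_wev_confidence <- function(x, R, d_k, w, sigma) {
  rnorm(length(x), mean = (1 - w) * x + w * d_k * R, sd = sigma)
}

set.seed(7)
n   <- 10000
d_k <- 2
S   <- sample(c(-1, 1), n, replace = TRUE)
x   <- rnorm(n, mean = S * d_k / 2)
R   <- ifelse(x > 0, 1, -1)

# Without weight and noise, the confidence variable reduces to y = x
y_no_weight <- sample_wev_confidence(x, R, d_k, w = 0, sigma = 0)
# With w > 0, y is pulled towards d_k in the direction of the choice
y_weighted  <- sample_wev_confidence(x, R, d_k, w = 0.5, sigma = 0.5)

mean_by_response <- tapply(y_weighted, R, mean)
print(mean_by_response)
```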
PDA represents the idea of on-going information accumulation after the
discrimination choice (Rausch et al., 2018). The parameter $a$ indicates the amount of additional
accumulation. The confidence variable is normally distributed with mean
$x + S \times d_k \times a$ and variance $a$.
For this model the parameter $a$ is fitted in
addition to the set of shared parameters.
According to IG, $y$ is sampled independently
from $x$ (Rausch & Zehetleitner, 2017).
$y$ is normally distributed with a mean of $S \times d_k \times m/2$ and variance
of 1 (again as it would scale with $m$). The free parameter $m$
represents the amount of information available for confidence judgment
relative to amount of evidence available for the discrimination decision and can
be smaller as well as greater than 1.
According to the version of ITG consistent
with the HMetad-method (Fleming, 2017; see Rausch et al., 2023), $y$ is sampled independently
from $x$ from a truncated Gaussian distribution with a location parameter
of $S \times d_k \times m/2$ and a scale parameter of 1. The Gaussian distribution of $y$
is truncated in a way that it is impossible to sample evidence that contradicts
the original decision: If $R = -1$, the distribution is truncated to the
right of $c$. If $R = 1$, the distribution is truncated to the left
of $c$. The additional parameter $m$ represents metacognitive efficiency,
i.e., the amount of information available for confidence judgments relative to
amount of evidence available for discrimination decisions and can be smaller
as well as greater than 1.
According to the version of the ITG consistent
with the original meta-d' method (Maniscalco & Lau, 2012, 2014; see Rausch et al., 2023),
$y$ is sampled independently from $x$ from a truncated Gaussian distribution with a location parameter
of $S \times d_k \times m/2$ and a scale parameter
of 1. If $R = -1$, the distribution is truncated to the right of $m \times c$.
If $R = 1$, the distribution is truncated to the left of $m \times c$.
The additional parameter $m$ represents metacognitive efficiency, i.e.,
the amount of information available for confidence judgments relative to
amount of evidence available for the discrimination decision and can be smaller
as well as greater than 1.
According to logN, the same sample
of sensory evidence is used to generate response and confidence, i.e.,
$y = x$, just as in SDT (Shekhar & Rahnev, 2021). However, according to logN, the confidence criteria
are not assumed to be constant, but instead they are affected by noise drawn from
a lognormal distribution. In each trial, $\theta_{-1,i}$ is given
by $c - \epsilon_i$. Likewise, $\theta_{1,i}$ is given by $c + \epsilon_i$.
$\epsilon_i$ is drawn from a lognormal distribution with
the location parameter $\mu_{R,i} = \log(|\overline{\theta}_{R,i} - c|) - 0.5 \times \sigma^2$ and
scale parameter $\sigma$. $\sigma$ is a free parameter designed to
quantify metacognitive ability. It is assumed that the criterion noise is perfectly
correlated across confidence criteria, ensuring that the confidence criteria
are always perfectly ordered. Because $\theta_{-1,1}, \dots, \theta_{-1,L-1}$,
$\theta_{1,1}, \dots, \theta_{1,L-1}$ change from trial to trial, they are not estimated
as free parameters. Instead, we estimate the means of the confidence criteria, i.e.,
$\overline{\theta}_{-1,1}, \dots, \overline{\theta}_{-1,L-1}, \overline{\theta}_{1,1}, \dots, \overline{\theta}_{1,L-1}$,
as free parameters.
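Lognormal criterion noise that is perfectly correlated across criteria can be generated by sharing a single standard-normal draw per trial. The following base R sketch assumes that each offset from the decision criterion is lognormal with location log(|mean criterion - c|) - 0.5 sigma^2 and scale sigma, so that its expected value equals the distance between the mean criterion and the decision criterion; the criterion values and variable names are made up.

```r
set.seed(3)
n_trials  <- 100000
c_dec     <- 0.2
sigma     <- 0.5
theta_bar <- c(0.8, 1.4, 2.1)   # mean confidence criteria for response 1 (made-up values)

# Location parameters chosen so that the lognormal mean equals |theta_bar - c|
mu <- log(abs(theta_bar - c_dec)) - 0.5 * sigma^2

# One standard-normal draw per trial, shared by all criteria, makes the
# criterion noise perfectly correlated across criteria.
z   <- rnorm(n_trials)
eps <- exp(outer(sigma * z, mu, "+"))   # n_trials x 3 matrix of lognormal offsets
theta_trial <- c_dec + eps              # trial-wise criteria for response 1

criterion_means <- colMeans(theta_trial)
print(criterion_means)                  # close to theta_bar
```

Because every criterion in a trial is shifted by the same underlying draw, the order of the criteria is preserved in each trial even though their positions vary.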
logWEV is a combination of logN and WEV proposed by Shekhar and Rahnev (2023).
Conceptually, logWEV assumes that the observer combines evidence about decision-relevant features
of the stimulus with the strength of evidence about choice-irrelevant features (Rausch et al., 2018).
The model also assumes that noise affecting the confidence decision variable is lognormal
in accordance with Shekhar and Rahnev (2021).
According to logWEV, the confidence decision variable $y$ is equal to $y^* \times R$.
$y^*$ is sampled from a lognormal distribution with a location parameter
of $(1-w) \times x \times R + w \times d_k$ and a scale parameter of $\sigma$.
The parameter $\sigma$ quantifies the amount of unsystematic variability
contributing to confidence judgments but not to the discrimination judgments.
The parameter $w$ represents the weight that is put on the choice-irrelevant
features in the confidence judgment.
$w$ and $\sigma$ are fitted in
addition to the set of shared parameters.
Gives a data frame with one row and one column for each of the fitted parameters of the
selected model as well as additional information about the fit:
negLogLik (negative log-likelihood of the final set of parameters),
k (number of parameters),
N (number of data rows),
AIC (Akaike Information Criterion; Akaike, 1974),
BIC (Bayes information criterion; Schwarz, 1978), and
AICc (AIC corrected for small samples; Burnham & Anderson, 2002).
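The three information criteria follow from the negative log-likelihood, the number of parameters, and the number of observations. The sketch below uses the standard textbook definitions; the helper name `fit_indices` and the input values are hypothetical, and whether the package computes the criteria in exactly this form is an assumption.

```r
# Information criteria from the negative log-likelihood (negLogLik),
# the number of parameters (k), and the number of observations (N).
fit_indices <- function(negLogLik, k, N) {
  AIC  <- 2 * negLogLik + 2 * k
  BIC  <- 2 * negLogLik + k * log(N)
  AICc <- AIC + (2 * k * (k + 1)) / (N - k - 1)   # small-sample correction
  c(AIC = AIC, BIC = BIC, AICc = AICc)
}

indices <- fit_indices(negLogLik = 100, k = 3, N = 50)
print(indices)
```

Lower values indicate a better trade-off between fit and model complexity, so the criteria are typically compared across models fitted to the same data.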
Sebastian Hellmann, [email protected]
Manuel Rausch, [email protected]
Akaike, H. (1974). A New Look at the Statistical Model Identification. IEEE Transactions on Automatic Control, AC-19(6), 716–723. doi: 10.1007/978-1-4612-1694-0_16
Burnham, K. P., & Anderson, D. R. (2002). Model selection and multimodel inference: A practical information-theoretic approach. Springer.
Fleming, S. M. (2017). HMeta-d: Hierarchical Bayesian estimation of metacognitive efficiency from confidence ratings. Neuroscience of Consciousness, 1, 1–14. doi: 10.1093/nc/nix007
Green, D. M., & Swets, J. A. (1966). Signal detection theory and psychophysics. Wiley.
Maniscalco, B., & Lau, H. (2012). A signal detection theoretic method for estimating metacognitive sensitivity from confidence ratings. Consciousness and Cognition, 21(1), 422–430.
Maniscalco, B., & Lau, H. C. (2014). Signal Detection Theory Analysis of Type 1 and Type 2 Data: Meta-d’, Response-Specific Meta-d’, and the Unequal Variance SDT Model. In S. M. Fleming & C. D. Frith (Eds.), The Cognitive Neuroscience of Metacognition (pp. 25–66). Springer. doi: 10.1007/978-3-642-45190-4_3
Maniscalco, B., & Lau, H. (2016). The signal processing architecture underlying subjective reports of sensory awareness. Neuroscience of Consciousness, 1, 1–17. doi: 10.1093/nc/niw002
Rausch, M., Hellmann, S., & Zehetleitner, M. (2018). Confidence in masked orientation judgments is informed by both evidence and visibility. Attention, Perception, and Psychophysics, 80(1), 134–154. doi: 10.3758/s13414-017-1431-5
Rausch, M., Hellmann, S., & Zehetleitner, M. (2023). Measures of metacognitive efficiency across cognitive models of decision confidence. Psychological Methods. doi: 10.31234/osf.io/kdz34
Rausch, M., & Zehetleitner, M. (2017). Should metacognition be measured by logistic regression? Consciousness and Cognition, 49, 291–312. doi: 10.1016/j.concog.2017.02.007
Schwarz, G. (1978). Estimating the dimension of a model. The Annals of Statistics, 6(2), 461–464. doi: 10.1214/aos/1176344136
Shekhar, M., & Rahnev, D. (2021). The Nature of Metacognitive Inefficiency in Perceptual Decision Making. Psychological Review, 128(1), 45–70. doi: 10.1037/rev0000249
Shekhar, M., & Rahnev, D. (2023). How Do Humans Give Confidence? A Comprehensive Comparison of Process Models of Perceptual Metacognition. Journal of Experimental Psychology: General. doi:10.1037/xge0001524
# 1. Select one subject from the masked orientation discrimination experiment
data <- subset(MaskOri, participant == 1)
head(data)

# 2. Use fitting function
# Fitting takes some time (about 10 minutes on a 2.8GHz processor) to run:
FitFirstSbjWEV <- fitConf(data, model="WEV")
The fitConfModels
function fits the parameters of several computational models of decision
confidence in binary choice tasks, specified in the models
argument, to
different subsets of one data frame, indicated by different values in the column
participant
of the data
argument.
fitConfModels
is a wrapper of the function fitConf
and calls
fitConf
for every possible combination
of model in the models
argument and sub-data frame of data
for each value
in the participant
column.
See Details for more information about the parameters.
Parameters are fitted using a maximum likelihood estimation method with an
initial grid search to find promising starting values for the optimization.
In addition, several measures of model fit (negative log-likelihood, BIC, AIC, and AICc)
are computed, which can be used for a quantitative model evaluation.
fitConfModels(data, models = "all", nInits = 5, nRestart = 4, .parallel = FALSE, n.cores = NULL)
data |
a
|
models |
|
nInits |
|
nRestart |
|
.parallel |
|
n.cores |
|
The provided data
argument is split into subsets according to the values of
the participant
column. Then for each subset and each model in the models
argument, the parameters of the respective model are fitted to the data subset.
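The split-and-fit pattern described above can be sketched in base R. In this illustration, `fit_stub` is a hypothetical stand-in for a real fitting function and only returns summary numbers; the toy data and model labels are made up.

```r
# Toy data with a participant column (values are made up)
set.seed(11)
toy <- data.frame(
  participant = rep(1:3, each = 40),
  correct     = rbinom(120, 1, 0.75)
)

# Stand-in for a model-fitting function; it only returns summary numbers.
fit_stub <- function(data, model) {
  data.frame(model = model, N = nrow(data), accuracy = mean(data$correct))
}

models  <- c("SDT", "WEV")
subsets <- split(toy, toy$participant)

# One job per combination of participant and model
jobs <- expand.grid(participant = names(subsets), model = models,
                    stringsAsFactors = FALSE)
fits <- do.call(rbind, lapply(seq_len(nrow(jobs)), function(i) {
  res <- fit_stub(subsets[[jobs$participant[i]]], jobs$model[i])
  cbind(participant = jobs$participant[i], res)
}))
print(fits)
```

Because the jobs are independent of one another, a list of this kind is straightforward to distribute across cores, which is the kind of workload a parallel option is meant for.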
The fitting routine first performs a coarse grid search to find promising
starting values for the maximum likelihood optimization procedure. Then the best nInits
parameter sets found by the grid search are used as the initial values for separate
runs of the Nelder-Mead algorithm implemented in optim
.
Each run is restarted nRestart
times.
The computational models are all based on signal detection theory (Green & Swets, 1966). It is assumed
that participants select a binary discrimination response $R$ about a stimulus $S$.
Both $S$ and $R$ can be either -1 or 1.
$R$ is considered correct if $S = R$.
In addition, we assume that there are $K$ different levels of stimulus discriminability
in the experiment, i.e. a physical variable that makes the discrimination task easier or harder.
For each level of discriminability, the function fits a different discrimination
sensitivity parameter $d_k$. If there is more than one sensitivity parameter,
we assume that the sensitivity parameters are ordered such that $0 < d_1 < d_2 < \dots < d_K$.
The models assume that the stimulus generates normally distributed sensory evidence $x$
with mean $S \times d_k/2$ and variance of 1. The sensory evidence $x$ is compared to a decision
criterion $c$ to generate a discrimination response $R$, which is 1 if $x$ exceeds $c$ and -1 otherwise.
To generate confidence, it is assumed that the confidence variable $y$ is compared to another
set of criteria $\theta_{R,i}$, $i = 1, 2, \dots, L-1$, depending on the
discrimination response $R$, to produce an $L$-step discrete confidence response.
The number of thresholds will be inferred from the number of steps in the
rating column of data.
Thus, the parameters shared between all models are:
sensitivity parameters $d_1, \dots, d_K$ ($K$: number of difficulty levels)
decision criterion $c$
confidence criteria $\theta_{-1,1}, \theta_{-1,2}, \dots, \theta_{-1,L-1}, \theta_{1,1}, \theta_{1,2}, \dots, \theta_{1,L-1}$ ($L$: number of confidence categories available for confidence ratings)
How the confidence variable is computed varies across the different models.
The following models have been implemented so far:
According to SDT, the same sample of sensory
evidence is used to generate response and confidence, i.e., $y = x$,
and the confidence criteria span from the left and
right side of the decision criterion $c$ (Green & Swets, 1966).
According to the Gaussian noise model (GN), $y$ is subject to
additive noise and assumed to be normally distributed around the decision
evidence value $x$ with a standard deviation $\sigma$ (Maniscalco & Lau, 2016).
$\sigma$ is an additional free parameter.
WEV assumes that the observer combines evidence about decision-relevant features
of the stimulus with the strength of evidence about choice-irrelevant features
to generate confidence (Rausch et al., 2018). Thus, the WEV model assumes that $y$ is normally
distributed with a mean of $(1-w) \times x + w \times d_k \times R$ and standard deviation $\sigma$.
The standard deviation $\sigma$ quantifies the amount of unsystematic variability
contributing to confidence judgments but not to the discrimination judgments.
The parameter $w$ represents the weight that is put on the choice-irrelevant
features in the confidence judgment.
$w$ and $\sigma$ are fitted in
addition to the set of shared parameters.
PDA represents the idea of on-going information accumulation after the
discrimination choice (Rausch et al., 2018). The parameter $a$ indicates the amount of additional
accumulation. The confidence variable is normally distributed with mean
$x + S \times d_k \times a$ and variance $a$.
For this model the parameter $a$ is fitted in addition to the shared
parameters.
According to IG, $y$ is sampled independently
from $x$ (Rausch & Zehetleitner, 2017).
$y$ is normally distributed with a mean of $S \times d_k \times m/2$ and variance
of 1 (again as it would scale with $m$). The additional parameter $m$
represents the amount of information available for confidence judgment
relative to amount of evidence available for the discrimination decision and can
be smaller as well as greater than 1.
According to the version of ITG consistent
with the HMetad-method (Fleming, 2017; see Rausch et al., 2023), $y$ is sampled independently
from $x$ from a truncated Gaussian distribution with a location parameter
of $S \times d_k \times m/2$ and a scale parameter of 1. The Gaussian distribution of $y$
is truncated in a way that it is impossible to sample evidence that contradicts
the original decision: If $R = -1$, the distribution is truncated to the
right of $c$. If $R = 1$, the distribution is truncated to the left
of $c$. The additional parameter $m$ represents metacognitive efficiency,
i.e., the amount of information available for confidence judgments relative to
amount of evidence available for discrimination decisions and can be smaller
as well as greater than 1.
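Sampling from a Gaussian that is truncated so that the sample cannot contradict the decision can be done with the inverse-CDF method in base R. The helper `sample_truncated`, the parameter values, and the location term (stimulus times sensitivity times an efficiency factor, divided by 2) are assumptions made for this sketch.

```r
# Hypothetical helper: sample from a normal distribution with location `loc`
# and scale 1 that is truncated at `cut` so that the sample agrees with R.
sample_truncated <- function(n, loc, cut, R) {
  p_cut <- pnorm(cut, mean = loc)
  if (R == 1) {
    u <- runif(n, min = p_cut, max = 1)   # keep only values above the cut
  } else {
    u <- runif(n, min = 0, max = p_cut)   # keep only values below the cut
  }
  qnorm(u, mean = loc)
}

set.seed(5)
S <- 1; d_k <- 1.5; m <- 0.8; c_dec <- 0.1
loc <- S * d_k * m / 2

y_plus  <- sample_truncated(5000, loc, c_dec, R = 1)
y_minus <- sample_truncated(5000, loc, c_dec, R = -1)
print(range(y_plus))    # all values lie above the criterion
print(range(y_minus))   # all values lie below the criterion
```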
According to the version of the ITG consistent
with the original meta-d' method (Maniscalco & Lau, 2012, 2014; see Rausch et al., 2023),
$y$ is sampled independently from $x$ from a truncated Gaussian distribution with a location parameter
of $S \times d_k \times m/2$ and a scale parameter
of 1. If $R = -1$, the distribution is truncated to the right of $m \times c$.
If $R = 1$, the distribution is truncated to the left of $m \times c$.
The additional parameter $m$ represents metacognitive efficiency, i.e.,
the amount of information available for confidence judgments relative to
amount of evidence available for the discrimination decision and can be smaller
as well as greater than 1.
According to logN, the same sample
of sensory evidence is used to generate response and confidence, i.e.,
$y = x$, just as in SDT (Shekhar & Rahnev, 2021). However, according to logN, the confidence criteria
are not assumed to be constant, but instead they are affected by noise drawn from
a lognormal distribution. In each trial, $\theta_{-1,i}$ is given
by $c - \epsilon_i$. Likewise, $\theta_{1,i}$ is given by $c + \epsilon_i$.
$\epsilon_i$ is drawn from a lognormal distribution with
the location parameter $\mu_{R,i} = \log(|\overline{\theta}_{R,i} - c|) - 0.5 \times \sigma^2$ and
scale parameter $\sigma$. $\sigma$ is a free parameter designed to
quantify metacognitive ability. It is assumed that the criterion noise is perfectly
correlated across confidence criteria, ensuring that the confidence criteria
are always perfectly ordered. Because $\theta_{-1,1}, \dots, \theta_{-1,L-1}$,
$\theta_{1,1}, \dots, \theta_{1,L-1}$ change from trial to trial, they are not estimated
as free parameters. Instead, we estimate the means of the confidence criteria, i.e.,
$\overline{\theta}_{-1,1}, \dots, \overline{\theta}_{-1,L-1}, \overline{\theta}_{1,1}, \dots, \overline{\theta}_{1,L-1}$,
as free parameters.
logWEV is a combination of logN and WEV proposed by Shekhar and Rahnev (2023).
Conceptually, logWEV assumes that the observer combines evidence about decision-relevant features
of the stimulus with the strength of evidence about choice-irrelevant features (Rausch et al., 2018).
The model also assumes that noise affecting the confidence decision variable is lognormal
in accordance with Shekhar and Rahnev (2021).
According to logWEV, the confidence decision variable $y$ is equal to $y^* \times R$.
$y^*$ is sampled from a lognormal distribution with a location parameter
of $(1-w) \times x \times R + w \times d_k$ and a scale parameter of $\sigma$.
The parameter $\sigma$ quantifies the amount of unsystematic variability
contributing to confidence judgments but not to the discrimination judgments.
The parameter $w$ represents the weight that is put on the choice-irrelevant
features in the confidence judgment.
$w$ and $\sigma$ are fitted in
addition to the set of shared parameters.
Gives a data.frame
with one row for each combination of model and
participant. There are different columns for the model, the participant ID, and
one column for each estimated model parameter (parameters
not present in a specific model are filled with NAs).
Additional information about the fit is provided in additional columns:
negLogLik
(negative log-likelihood of the best-fitting set of parameters),
k
(number of parameters),
N
(number of trials),
AIC
(Akaike Information Criterion; Akaike, 1974),
BIC
(Bayes information criterion; Schwarz, 1978),
AICc
(AIC corrected for small samples; Burnham & Anderson, 2002)
If length(models) > 1 or models == "all", there will be three additional columns:
Sebastian Hellmann, [email protected]
Manuel Rausch, [email protected]
Akaike, H. (1974). A New Look at the Statistical Model Identification. IEEE Transactions on Automatic Control, AC-19(6), 716–723. doi: 10.1007/978-1-4612-1694-0_16
Burnham, K. P., & Anderson, D. R. (2002). Model selection and multimodel inference: A practical information-theoretic approach. Springer.
Fleming, S. M. (2017). HMeta-d: Hierarchical Bayesian estimation of metacognitive efficiency from confidence ratings. Neuroscience of Consciousness, 1, 1–14. doi: 10.1093/nc/nix007
Green, D. M., & Swets, J. A. (1966). Signal detection theory and psychophysics. Wiley.
Maniscalco, B., & Lau, H. (2012). A signal detection theoretic method for estimating metacognitive sensitivity from confidence ratings. Consciousness and Cognition, 21(1), 422–430.
Maniscalco, B., & Lau, H. C. (2014). Signal Detection Theory Analysis of Type 1 and Type 2 Data: Meta-d’, Response-Specific Meta-d’, and the Unequal Variance SDT Model. In S. M. Fleming & C. D. Frith (Eds.), The Cognitive Neuroscience of Metacognition (pp. 25–66). Springer. doi: 10.1007/978-3-642-45190-4_3
Maniscalco, B., & Lau, H. (2016). The signal processing architecture underlying subjective reports of sensory awareness. Neuroscience of Consciousness, 1, 1–17. doi: 10.1093/nc/niw002
Rausch, M., Hellmann, S., & Zehetleitner, M. (2018). Confidence in masked orientation judgments is informed by both evidence and visibility. Attention, Perception, and Psychophysics, 80(1), 134–154. doi: 10.3758/s13414-017-1431-5
Rausch, M., Hellmann, S., & Zehetleitner, M. (2023). Measures of metacognitive efficiency across cognitive models of decision confidence. Psychological Methods. doi: 10.31234/osf.io/kdz34
Rausch, M., & Zehetleitner, M. (2017). Should metacognition be measured by logistic regression? Consciousness and Cognition, 49, 291–312. doi: 10.1016/j.concog.2017.02.007
Schwarz, G. (1978). Estimating the dimension of a model. The Annals of Statistics, 6(2), 461–464. doi: 10.1214/aos/1176344136
Shekhar, M., & Rahnev, D. (2021). The Nature of Metacognitive Inefficiency in Perceptual Decision Making. Psychological Review, 128(1), 45–70. doi: 10.1037/rev0000249
Shekhar, M., & Rahnev, D. (2023). How Do Humans Give Confidence? A Comprehensive Comparison of Process Models of Perceptual Metacognition. Journal of Experimental Psychology: General. doi:10.1037/xge0001524
# 1. Select two subjects from the masked orientation discrimination experiment
data <- subset(MaskOri, participant %in% c(1:2))
head(data)

# 2. Fit some models to each subject of the masked orientation discrimination experiment
# Fitting several models to several subjects takes quite some time
# (about 10 minutes per model fit per participant on a 2.8GHz processor
# with the default values of nInits and nRestart).
# If you want to fit more than just two subjects,
# we strongly recommend setting .parallel=TRUE
Fits <- fitConfModels(data, models = c("SDT", "ITGc"), .parallel = FALSE)
This function computes measures of metacognitive sensitivity, meta-d',
and metacognitive efficiency, meta-d'/d' (Maniscalco and Lau, 2012, 2014;
Fleming, 2017), from binary choice tasks with discrete confidence
judgments. Meta-d' and meta-d'/d' are computed using a maximum likelihood
method for each subset of the data
argument indicated by different values
in the column participant
, which can represent different subjects as well
as experimental conditions.
fitMetaDprime(data, model = "ML", nInits = 5, nRestart = 3, .parallel = FALSE, n.cores = NULL)
data |
a
|
model |
|
nInits |
|
nRestart |
|
.parallel |
|
n.cores |
|
The function computes meta-d' and meta-d'/d' either using the hypothetical signal detection model assumed by Maniscalco and Lau (2012, 2014) or the one assumed by Fleming (2017).
The conceptual idea of meta-d' is to quantify metacognition in terms of sensitivity in a hypothetical signal detection rating model describing the primary task, under the assumption that participants had perfect access to the sensory evidence and were perfectly consistent in placing their confidence criteria (Maniscalco & Lau, 2012, 2014). Using a signal detection model describing the primary task to quantify metacognition allows a direct comparison between metacognitive accuracy and discrimination performance because both are measured on the same scale. Meta-d' can be compared against the estimate of the distance between the two stimulus distributions estimated from discrimination responses, which is referred to as d': If meta-d' equals d', it means that metacognitive accuracy is exactly as good as expected from discrimination performance. If meta-d' is lower than d', it means that metacognitive accuracy is suboptimal. It can be shown that the implicit model of confidence underlying the meta-d'/d' method is identical to the independent truncated Gaussian model.
The provided data
argument is split into subsets according to the values of
the participant
column. Then for each subset, the parameters of the
hypothetical signal detection model determined by the model
argument
are fitted to the data subset.
The fitting routine first performs a coarse grid search to find promising
starting values for the maximum likelihood optimization procedure. Then the best nInits
parameter sets found by the grid search are used as the initial values for separate
runs of the Nelder-Mead algorithm implemented in optim
.
Each run is restarted nRestart
times. Warning: meta-d'/d'
is only guaranteed to be unbiased from discrimination sensitivity, discrimination
bias, and confidence criteria if the data is generated according to the
independent truncated Gaussian model (see Rausch et al., 2023).
Gives a data frame with one row for each participant and the following columns:
model gives the model used for the computation of meta-d' (see model argument)
participant is the participant ID for the respective row
dprime is the discrimination sensitivity index d', calculated using a standard SDT formula
c is the discrimination bias c, calculated using a standard SDT formula
metaD is meta-d', discrimination sensitivity estimated from confidence judgments conditioned on the response
Ratio is meta-d'/d', a quantity usually referred to as metacognitive efficiency.
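For orientation, the discrimination indices d' and c can be computed from hit and false-alarm rates with the textbook equal-variance formulas. This is a minimal sketch: the helper `sdt_indices`, the rates, and the illustrative meta-d' value are made up, and the package may apply additional corrections (for example for extreme rates) that are not shown here.

```r
# Standard equal-variance SDT indices from hit and false-alarm rates.
sdt_indices <- function(hit_rate, fa_rate) {
  z_hit <- qnorm(hit_rate)
  z_fa  <- qnorm(fa_rate)
  c(dprime = z_hit - z_fa, c = -0.5 * (z_hit + z_fa))
}

# Made-up rates that correspond to d' = 2 with an unbiased criterion
idx <- sdt_indices(hit_rate = pnorm(1), fa_rate = pnorm(-1))
print(idx)

# Metacognitive efficiency is the ratio meta-d'/d'; the meta-d' value here is
# purely illustrative and not an estimate.
meta_d <- 1.6
ratio  <- meta_d / idx[["dprime"]]
print(ratio)
```

A ratio below 1 would indicate that confidence judgments carry less information than expected from discrimination performance.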
Manuel Rausch, [email protected]
Fleming, S. M. (2017). HMeta-d: Hierarchical Bayesian estimation of metacognitive efficiency from confidence ratings. Neuroscience of Consciousness, 1, 1–14. doi: 10.1093/nc/nix007
Maniscalco, B., & Lau, H. (2012). A signal detection theoretic method for estimating metacognitive sensitivity from confidence ratings. Consciousness and Cognition, 21(1), 422–430.
Maniscalco, B., & Lau, H. C. (2014). Signal Detection Theory Analysis of Type 1 and Type 2 Data: Meta-d’, Response-Specific Meta-d’, and the Unequal Variance SDT Model. In S. M. Fleming & C. D. Frith (Eds.), The Cognitive Neuroscience of Metacognition (pp. 25–66). Springer. doi: 10.1007/978-3-642-45190-4_3
Rausch, M., Hellmann, S., & Zehetleitner, M. (2023). Measures of metacognitive efficiency across cognitive models of decision confidence. Psychological Methods. doi: 10.31234/osf.io/kdz34
# 1. Select two subjects from the masked orientation discrimination experiment
data <- subset(MaskOri, participant %in% c(1:2))
head(data)

# 2. Fit meta-d'/d' for each subject in data
MetaDs <- fitMetaDprime(data, model="F", .parallel = FALSE)
In each trial, participants were shown a sinusoidal grating oriented either horizontally or vertically, followed by a mask after varying stimulus-onset asynchronies. Participants were instructed to report the orientation and their degree of confidence as accurately as possible.
data(MaskOri)
A data.frame with 25920 rows representing different trials and 7 variables:
integer values as unique participant identifier
orientation of the grating (90: vertical, 0: horizontal)
participants' orientation judgment about the grating (90: vertical, 0: horizontal)
0-1 column indicating whether the discrimination response was correct (1) or not (0)
0-4 confidence rating on a continuous scale binned into five categories
stimulus-onset-asynchrony in ms (i.e. time between stimulus and mask onset)
Enumeration of trials per participant
Hellmann, S., Zehetleitner, M., & Rausch, M. (2023). Simultaneous modeling of choice, confidence, and response time in visual perception. Psychological Review, 130(6), 1521–1543. doi: 10.1037/rev0000411
data(MaskOri)
summary(MaskOri)
The plotConfModelFit function plots the predicted distribution of discrimination responses and confidence ratings created from a data.frame of parameters obtained from fitConfModels and overlays the predicted distribution over the data to which the model parameters were fitted.
plotConfModelFit(data, fitted_pars, model = NULL)
data |
a
|
fitted_pars |
a |
model |
|
A ggplot object with the empirically observed distribution of responses and confidence ratings
as bars on the x-axis as a function of discriminability (in the rows) and stimulus
(in the columns). Superimposed on the empirical data,
the plot also shows the prediction of one selected model as dots.
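The observed relative frequencies shown as bars can be thought of as a cross-tabulation of response and rating within each stimulus. The sketch below illustrates this with base R on a made-up data frame; the column names and values are hypothetical, and it is not the code that plotConfModelFit runs internally.

```r
# Toy data with hypothetical column names (stimulus, response, rating);
# these values are made up and are not MaskOri.
toy <- data.frame(
  stimulus = c(-1, -1, -1, -1,  1, 1,  1, 1),
  response = c(-1, -1,  1, -1,  1, 1, -1, 1),
  rating   = c( 2,  1,  1,  2,  2, 1,  1, 2)
)

# Count each response x rating combination separately for each stimulus
counts <- table(stimulus = toy$stimulus,
                response = toy$response,
                rating   = toy$rating)

# Convert counts to relative frequencies within each stimulus (margin 1),
# so that the frequencies for each stimulus sum to 1
rel_freq <- prop.table(counts, margin = 1)
```

With several difficulty conditions, the same tabulation would be repeated within each condition, which corresponds to the rows of the plot.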
Manuel Rausch, [email protected]
# 1. Fit some models to each subject of the masked orientation discrimination experiment
# Normally, the fits should be created using the function fitConfModels
# Fits <- fitConfModels(data, models = "WEV", .parallel = TRUE)
# Here, we create the dataframe manually because fitting models takes about
# 10 minutes per model fit per participant on a 2.8GHz processor.
pars <- data.frame(participant = 1:16,
  d_1 = c(0.20, 0.05, 0.41, 0.03, 0.00, 0.01, 0.11, 0.03, 0.19, 0.08, 0.00, 0.24, 0.00, 0.00, 0.25, 0.01),
  d_2 = c(0.61, 0.19, 0.86, 0.18, 0.17, 0.39, 0.69, 0.14, 0.45, 0.30, 0.00, 0.27, 0.00, 0.05, 0.57, 0.23),
  d_3 = c(1.08, 1.04, 2.71, 2.27, 1.50, 1.21, 1.83, 0.80, 1.06, 0.68, 0.29, 0.83, 0.77, 2.19, 1.93, 0.54),
  d_4 = c(3.47, 4.14, 6.92, 4.79, 3.72, 3.24, 4.55, 2.51, 3.78, 2.40, 1.95, 2.55, 4.59, 4.27, 4.08, 1.80),
  d_5 = c(4.08, 5.29, 7.99, 5.31, 4.53, 4.66, 6.21, 4.67, 5.85, 3.39, 3.39, 4.42, 6.48, 5.35, 5.28, 2.87),
  c = c(-0.30, -0.15, -1.37, 0.17, -0.12, -0.19, -0.12, 0.41, -0.27, 0.00, -0.19, -0.21, -0.91, -0.26, -0.20, 0.10),
  theta_minus.4 = c(-2.07, -2.04, -2.76, -2.32, -2.21, -2.33, -2.27, -2.29, -2.69, -3.80, -2.83, -1.74, -2.58, -3.09, -2.20, -1.57),
  theta_minus.3 = c(-1.25, -1.95, -1.92, -2.07, -1.62, -1.68, -2.04, -2.02, -1.84, -3.37, -1.89, -1.44, -2.31, -2.08, -1.53, -1.46),
  theta_minus.2 = c(-0.42, -1.40, -0.37, -1.96, -1.45, -1.27, -1.98, -1.66, -1.11, -2.69, -1.60, -1.25, -2.21, -1.68, -1.08, -1.17),
  theta_minus.1 = c(0.13, -0.90, 0.93, -1.71, -1.25, -0.59, -1.40, -1.00, -0.34, -1.65, -1.21, -0.76, -1.99, -0.92, -0.28, -0.99),
  theta_plus.1 = c(-0.62, 0.82, -2.77, 2.01, 1.39, 0.60, 1.51, 0.90, 0.18, 1.62, 0.99, 0.88, 1.67, 0.92, 0.18, 0.88),
  theta_plus.2 = c(0.15, 1.45, -1.13, 2.17, 1.61, 1.24, 1.99, 1.55, 0.96, 2.44, 1.53, 1.66, 2.00, 1.51, 1.08, 1.05),
  theta_plus.3 = c(1.40, 2.24, 0.77, 2.32, 1.80, 1.58, 2.19, 2.19, 1.54, 3.17, 1.86, 1.85, 2.16, 2.09, 1.47, 1.70),
  theta_plus.4 = c(2.19, 2.40, 1.75, 2.58, 2.53, 2.24, 2.59, 2.55, 2.58, 3.85, 2.87, 2.15, 2.51, 3.31, 2.27, 1.79),
  sigma = c(1.01, 0.64, 1.33, 0.39, 0.30, 0.75, 0.75, 1.07, 0.65, 0.29, 0.31, 0.78, 0.39, 0.42, 0.69, 0.52),
  w = c(0.54, 0.50, 0.38, 0.38, 0.36, 0.44, 0.48, 0.48, 0.52, 0.46, 0.53, 0.48, 0.29, 0.45, 0.51, 0.63))

# 2. Plot the predicted probabilities based on model and fitted parameters
# against the observed relative frequencies.
PlotFitWEV <- plotConfModelFit(MaskOri, pars, model = "WEV")
PlotFitWEV
This function generates a data frame of random trials simulated according to the computational model of decision confidence specified in the model argument, using the given parameters. Simulations can be used to visualize and test qualitative model predictions (e.g. using previously fitted parameters returned by fitConf). See fitConf for a full mathematical description of all models and their parameters.
simConf(model = "SDT", paramDf)
model |
|
paramDf |
a
|
For each row of paramDf, the function generates about N trials with the parameters provided in that row. The output includes a column participant indicating the row ID of the simulated data. The values of the participant column may be controlled by the user by including a participant column in the input paramDf. Note that the values of this column have to be unique! If no participant column is present in the input, the row numbers will be used as row IDs.
The number of simulated trials for each row of parameters may deviate slightly from the provided N. Precisely, if there are K levels of sensitivity (i.e. there are columns d1, d2, ..., dK), the function simulates round(N/2/K) trials per stimulus identity (2 levels) and level of sensitivity (K levels).
Simulation is performed following the generative process structure of the models.
See fitConf
for a detailed description of the different models.
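The rounding rule described above can be checked with a short calculation. The values of N and K below are only an example, and the variable names are chosen for illustration.

```r
N <- 500  # requested number of trials for one row of parameters
K <- 3    # number of sensitivity levels (e.g. three sensitivity columns)

# Trials per combination of stimulus identity (2 levels) and sensitivity level (K levels)
trials_per_cell <- round(N / 2 / K)

# Trials actually simulated for this row of parameters
total_trials <- trials_per_cell * 2 * K

trials_per_cell  # 83
total_trials     # 498
```

In this example the simulated data set for the row would contain 498 rather than 500 trials, which is the slight deviation from N mentioned above.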
A data frame with about nrow(paramDf)*N rows (see Details) and the following columns:
participant
giving the row ID of the simulation (see Details)
stimulus
giving the category of the stimulus (-1 or 1)
diffCond
(only if more than one sensitivity parameter (d1, d2, ...) is provided) representing the difficulty condition (values correspond to the levels of the sensitivity parameters, i.e. diffCond=1 represents simulated trials with sensitivity d1)
response
giving the response category (-1 or 1, corresponding to the stimulus categories)
rating
giving the discrete confidence rating (integer, number of categories depends on the number of confidence criteria provided in the parameters)
correct
giving the accuracy of the response (0 incorrect, 1 correct)
ratings
same as rating
but as a factor
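To make the meaning of these output columns concrete, here is a minimal base R sketch of a generic equal-variance SDT generative process that produces the same kind of columns. It is an illustration under stated assumptions, not the implementation used by simConf: the parameter names, the evidence distribution (mean stimulus * d / 2, unit variance), and the rating numbering starting at 1 are all assumptions made for this example.

```r
set.seed(1)  # for reproducibility of this illustration only

# Hypothetical parameters for a single simulated observer
d           <- 2    # sensitivity
crit        <- 0    # discrimination criterion
theta_minus <- -1   # confidence criterion for response -1
theta_plus  <- 1    # confidence criterion for response  1
n           <- 200  # trials per stimulus category

stimulus <- rep(c(-1, 1), each = n)

# Sensory evidence: normal with mean stimulus * d / 2 and unit variance (assumption)
x <- rnorm(length(stimulus), mean = stimulus * d / 2, sd = 1)

# Discrimination response and its accuracy
response <- ifelse(x > crit, 1, -1)
correct  <- as.integer(response == stimulus)

# Confidence: 1 plus the number of confidence criteria the evidence exceeds
# on the side of the chosen response (numbering from 1 is an arbitrary choice)
rating <- ifelse(response == 1,
                 1 + rowSums(outer(x, theta_plus, ">")),
                 1 + rowSums(outer(x, theta_minus, "<")))

sim <- data.frame(stimulus, response, rating, correct,
                  ratings = factor(rating))
head(sim)
```

With more confidence criteria per response, theta_minus and theta_plus would become vectors and the same counting rule would yield more rating categories.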
Manuel Rausch, [email protected]
# 1. Define some parameters
paramDf <- data.frame(d_1 = 0, d_2 = 2, d_3 = 4, c = .0,
  theta_minus.2 = -2, theta_minus.1 = -1,
  theta_plus.1 = 1, theta_plus.2 = 2,
  sigma = 1/2, w = 0.5, N = 500)

# 2. Simulate dataset
SimulatedData <- simConf(model = "WEV", paramDf)