MAM Model Analysis

Purpose

This analysis looks at the overall panel performance in terms of Discrimination, Agreement and Repeatability or Reproducibility, and then the performance of each individual in the panel in these terms. It uses a more sophisticated model than the Panellist Performance analysis in order to understand the different types of disagreement between panellists.

Data Format

  1. See the profiling dataset.
  2. The attributes should be of scale or interval type.
  3. Replications are required for this analysis.
  4. A complete design is required for this analysis. If, for example, an assessor misses a session, you could remove that assessor in the ‘Visualization & Selection’ step to create a complete design that works with this analysis.
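
The completeness requirement above can be checked before running the analysis. The following is a minimal sketch, assuming the data are available as (assessor, product, replicate) rows; the function names `missing_cells` and `drop_incomplete_assessors` are hypothetical and are not part of the software.

```python
from itertools import product as cartesian


def missing_cells(records):
    """Return the (assessor, product, replicate) combinations that are absent.

    records: list of (assessor, product, replicate) tuples, one per observed row.
    """
    seen = set(records)
    assessors = {a for a, _, _ in seen}
    products = {p for _, p, _ in seen}
    replicates = {r for _, _, r in seen}
    # A complete design contains every combination of the three factors.
    return sorted(set(cartesian(assessors, products, replicates)) - seen)


def drop_incomplete_assessors(records):
    """Remove every assessor who has at least one missing cell."""
    incomplete = {a for a, _, _ in missing_cells(records)}
    return [row for row in records if row[0] not in incomplete]
```

An empty list from `missing_cells` indicates a complete design. Dropping the affected assessors mirrors the manual removal described in point 4.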

Background

This analysis fits a linear model for each attribute:

Attribute = Product + Assessor + Scale + Agreement + Residuals

Where the Scale and Agreement terms are from partitioning the interaction term in the following model:

Attribute = Product + Assessor + Product:Assessor + Residuals.
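
As an illustration of this partition, the sketch below splits the Product:Assessor sum of squares into a Scale part and an Agreement part for a balanced, complete design. It assumes that the Scale part comes from regressing each assessor's interaction effects on the centred product means, as described in the mixed assessor model literature. The function name `mam_partition` and the input layout are hypothetical and are not taken from the software.

```python
from statistics import mean


def mam_partition(scores):
    """Partition the interaction sum of squares into Scale and Agreement.

    scores maps (assessor, product) -> list of replicate scores.
    Assumes a balanced, complete design with at least three products.
    Returns term -> (sum of squares, degrees of freedom).
    """
    assessors = sorted({a for a, _ in scores})
    products = sorted({p for _, p in scores})
    n_rep = len(next(iter(scores.values())))

    cell = {key: mean(vals) for key, vals in scores.items()}
    grand = mean(cell.values())
    a_mean = {a: mean(cell[a, p] for p in products) for a in assessors}
    p_mean = {p: mean(cell[a, p] for a in assessors) for p in products}

    # Centred product effects act as the regressor for the scaling term.
    x = {p: p_mean[p] - grand for p in products}
    sxx = sum(v * v for v in x.values())

    ss_inter = 0.0
    ss_scale = 0.0
    for a in assessors:
        inter = {p: cell[a, p] - a_mean[a] - p_mean[p] + grand for p in products}
        # Slope of this assessor's interaction effects on the product effects.
        slope = sum(inter[p] * x[p] for p in products) / sxx
        ss_scale += n_rep * slope ** 2 * sxx
        ss_inter += n_rep * sum(v * v for v in inter.values())

    n_a, n_p = len(assessors), len(products)
    return {
        "Product:Assessor": (ss_inter, (n_a - 1) * (n_p - 1)),
        "Scale": (ss_scale, n_a - 1),
        "Agreement": (ss_inter - ss_scale, (n_a - 1) * (n_p - 2)),
    }
```

The Scale and Agreement sums of squares add up to the Product:Assessor sum of squares, and the same holds for their degrees of freedom.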

Options

  1. Significant Threshold (panel): Either 5%, 10% or 20%, the significance threshold used for determining if the panel performance is satisfactory on a particular term.
  2. Significant Threshold (panellist): Either 5%, 10% or 20%, the significance threshold used for determining if the panellist performance is satisfactory on a particular term.
  3. MAMCAP table output: Should the MAMCAP table, which attempts to communicate all panellist level results in one table, be included? Otherwise a simpler table is included.
  4. Number of Decimals for Values: The number of decimal places to round values to.
  5. Number of Decimals for P-Values: The number of decimal places to round p-values to.

Results and Interpretation

ANOVA

For each attribute we have an ANOVA table from the panel level models. These can be interpreted in the same way as any other ANOVA table. The Assessor, Product and Scaling effects are tested against the Agreement Mean Square, instead of the Mean Square Error (MSE).
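
The choice of denominator can be sketched as follows. This is an illustrative calculation only: the function name `mam_f_ratios` is hypothetical, and testing the Agreement term itself against the MSE is an assumption, since the text above only states the denominator for the other three terms.

```python
def mam_f_ratios(ss_df, mse):
    """Compute F ratios for the terms of the panel level model.

    ss_df maps term -> (sum of squares, degrees of freedom) and must contain
    "Product", "Assessor", "Scale" and "Agreement". mse is the mean square error.
    """
    ms = {term: ss / df for term, (ss, df) in ss_df.items()}
    # Product, Assessor and Scale are tested against the Agreement mean square.
    f = {term: ms[term] / ms["Agreement"] for term in ("Product", "Assessor", "Scale")}
    # Assumption: the Agreement term is tested against the MSE.
    f["Agreement"] = ms["Agreement"] / mse
    return f
```

Each F ratio would then be compared with an F distribution whose denominator degrees of freedom are those of the mean square used as denominator.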

Discrimination

Discrimination here refers to the assessors’ ability to discriminate between products. We quantify this through the ANOVA models: the discrimination p-values are the p-values associated with the product effect in these models. A lower discrimination p-value is usually more desirable because it suggests that the assessors are able to distinguish between the products.

The DISCRIMINATION table displays the panellist level discrimination p-values for each panellist against each attribute, highlighted “Good”, “Poor” or “Bad” based on commonly used thresholds.

The panel level discrimination can be viewed either in the ANOVA tables or summarised in the PANEL PERFORMANCE table, which displays the discrimination F-values for each attribute and highlights them green if the associated p-value is less than the chosen Significant Threshold (panel), or red if it is greater.

Scaling

Scaling here refers to the assessors’ use of the scale when evaluating products, with an interest in whether assessors use the scale differently from each other. We quantify this through the linear models and ANOVA. A higher scaling p-value is usually more desirable because it suggests that the assessors are using the scale in a similar manner.

The SCALING COEFFICIENTS table gives the scaling coefficient for each panellist against each attribute. These are the coefficients for scaling in the linear model. A coefficient greater than 1 suggests the panellist spreads their scores more than the panel, whereas a coefficient between 0 and 1 suggests the panellist spreads their scores less than the panel. A negative coefficient is also possible, suggesting for example that the assessor may be using the scale in the wrong direction. How large a difference from 1 is statistically significant is addressed by the SCALING P-VALUES.
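
To make the interpretation of these coefficients concrete, the sketch below estimates one coefficient per panellist as the slope of that panellist's centred product means on the panel's centred product means. This is one standard way to estimate such a slope and is an assumption about the calculation, not a description of the software's implementation; `scaling_coefficients` and `describe` are hypothetical names.

```python
from statistics import mean


def scaling_coefficients(cell_means):
    """cell_means maps (assessor, product) -> mean score over replicates."""
    assessors = sorted({a for a, _ in cell_means})
    products = sorted({p for _, p in cell_means})
    p_mean = {p: mean(cell_means[a, p] for a in assessors) for p in products}
    grand = mean(p_mean.values())
    x = {p: p_mean[p] - grand for p in products}  # centred panel product means
    sxx = sum(v * v for v in x.values())

    coefs = {}
    for a in assessors:
        a_mean = mean(cell_means[a, p] for p in products)
        # Least squares slope of the assessor's centred scores on x.
        coefs[a] = sum((cell_means[a, p] - a_mean) * x[p] for p in products) / sxx
    return coefs


def describe(coef):
    """Plain-language reading of a coefficient, ignoring significance."""
    if coef < 0:
        return "scale possibly used in the wrong direction"
    if coef < 1:
        return "scores spread less than the panel"
    if coef > 1:
        return "scores spread more than the panel"
    return "same spread as the panel"
```

In a complete design the coefficients estimated this way average to 1 across the panel, which is why 1 is the natural reference value.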

The SCALING P-VALUES table tabulates the p-values of statistically testing if the scaling coefficients are different from 1, for each judge and attribute. These are highlighted “Good”, “Poor”, or “Bad” based on commonly used thresholds.

The panel level scaling can be viewed either in the ANOVA tables or summarised in the PANEL PERFORMANCE table, which displays the scaling F-values for each attribute and highlights them green if the associated p-value is greater than the chosen Significant Threshold (panel), or red if it is less.

Agreement

Agreement here refers to the level of consensus between assessors. Unlike in the Panellist Performance analysis, scaling effects are separated out from agreement here; this is sometimes referred to as “pure agreement” in the literature.

We quantify this through the linear models and ANOVA. A higher agreement p-value is more desirable because it suggests that there is broad consensus between assessors.

The panel level agreement can be viewed either in the ANOVA tables or summarised in the PANEL PERFORMANCE table, which displays the agreement F-values for each attribute and highlights them green if the associated p-value is greater than the chosen Significant Threshold (panel), or red if it is less.

The AGREEMENT table displays the panellist level agreement p-values for each panellist against each attribute, highlighted “Good”, “Poor” or “Bad” based on commonly used thresholds.

Repeatability

Repeatability is a measure of the consistency of the assessors in evaluating the same products.

On the panel level this is calculated from the root mean square error of the appropriate ANOVA, displayed in the ANOVA RESULTS and summarised in the PANEL PERFORMANCE table. A smaller root mean square error is generally preferable.

On the panellist level we fit the following ANOVA model separately for each panellist:

 Attribute = Product + Residuals

We then use F-tests to make comparisons across panellists. The results are tabulated in the REPEATABILITY table, for every panellist and attribute, and are highlighted “Good”, “Poor” or “Bad” by commonly used p-value thresholds.
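
The per-panellist fit can be sketched as below for a single attribute. The text does not specify exactly how the F-tests compare panellists, so the comparison used here, each panellist's residual mean square divided by the pooled residual mean square of the remaining panellists, is an assumption made for illustration only. The names `residual_ms` and `repeatability_ratios` are hypothetical.

```python
from statistics import mean


def residual_ms(by_product):
    """Residual mean square of Attribute = Product + Residuals for one panellist.

    by_product maps product -> list of replicate scores.
    Returns (mean square, residual degrees of freedom).
    """
    ss = 0.0
    df = 0
    for vals in by_product.values():
        centre = mean(vals)
        ss += sum((v - centre) ** 2 for v in vals)
        df += len(vals) - 1
    return ss / df, df


def repeatability_ratios(panel):
    """panel maps panellist -> {product: replicate scores}.

    Assumption: each panellist is compared with the pooled residual
    mean square of all other panellists.
    """
    stats = {name: residual_ms(data) for name, data in panel.items()}
    ratios = {}
    for name, (ms, _) in stats.items():
        others = [(m, d) for other, (m, d) in stats.items() if other != name]
        pooled = sum(m * d for m, d in others) / sum(d for _, d in others)
        ratios[name] = ms / pooled
    return ratios
```

Under this assumption, a ratio well above 1 would indicate a panellist who is less repeatable than the rest of the panel.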

MAMCAP

The MAMCAP table attempts to communicate all panellist level results in one table. It is a table of panellist against attribute. Agreement is displayed by the colour coding of the table, while other terms are communicated via symbols.

If the MAMCAP table is not selected, then the OVERALL SUMMARY table is output instead. For each panellist and term, this displays a count of the attributes that are on the preferable side of the chosen Significant Threshold (panellist).

Summary tables

The PANELLIST SUMMARY (%) table summarises the panellist level results. For each measure of performance it gives the proportion of attributes for which a panellist's p-value is on the preferable side of the chosen Significant Threshold (panellist). Which side is preferable depends on the measure: a significant discrimination p-value suggests the panellist can distinguish the products, whereas a significant agreement p-value suggests the panellist is in disagreement with the panel.
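
The calculation of such percentages can be sketched as follows. The function name `panellist_summary`, the input layout and the handling of p-values exactly equal to the threshold are assumptions for illustration; only the direction of "preferable" for Discrimination, Scaling and Agreement is taken from this article.

```python
# Terms for which a significant (small) p-value is the preferable outcome.
SIGNIFICANT_IS_GOOD = {"Discrimination"}


def panellist_summary(p_values, threshold=0.05):
    """Percentage of attributes on the preferable side of the threshold.

    p_values maps term -> {attribute: p-value} for one panellist.
    """
    summary = {}
    for term, by_attribute in p_values.items():
        if term in SIGNIFICANT_IS_GOOD:
            good = [p < threshold for p in by_attribute.values()]
        else:
            # For Scaling and Agreement a non-significant p-value is preferable.
            good = [p >= threshold for p in by_attribute.values()]
        summary[term] = 100 * sum(good) / len(good)
    return summary
```

For example, a panellist with discrimination p-values of 0.01 and 0.20 on two attributes would score 50% for Discrimination at the 5% threshold.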

Similarly, the PANEL SUMMARY (%) table summarises the results in the PANEL PERFORMANCE table, giving the proportion of attributes that are on the preferable side of the chosen Significant Threshold (panel) for each term (except Repeatability).

Information

Attributes with zero variance will be removed from the analysis with a warning included in this table.

Technical Information

  1. This analysis uses sum to zero contrasts when fitting the linear models. 
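
For readers unfamiliar with this coding, the sketch below builds a sum to zero contrast matrix for a factor. It uses one common convention in which the last level is coded as -1 in every column; which level the software treats this way is not stated here, so that choice is an assumption. The function name `sum_to_zero_contrasts` is hypothetical.

```python
def sum_to_zero_contrasts(levels):
    """Return level -> contrast row for a factor with len(levels) levels.

    Each of the k - 1 columns sums to zero over the levels, so the
    estimated effects of a factor are constrained to sum to zero.
    """
    k = len(levels)
    rows = {}
    for i, level in enumerate(levels):
        if i < k - 1:
            rows[level] = [1 if j == i else 0 for j in range(k - 1)]
        else:
            # Assumption: the last level carries -1 in every column.
            rows[level] = [-1] * (k - 1)
    return rows
```

With this coding, each effect is expressed as a deviation from the overall mean rather than from a reference level.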

References

[1] C. Peltier, P. B. Brockhoff, M. Visalli, P. Schlich, “The MAM-CAP table: A new tool for monitoring panel performances”, Food Quality and Preference, vol. 32, part A, pp. 24-27, 2014.

[2] S. Pødenphant, M. H. Truong, K. Kristensen, P. B. Brockhoff, “The Mixed Assessor Model and the multiplicative mixed model”, Food Quality and Preference, vol. 74, pp. 38-48, 2019.


