Communications in Biometry and Crop Science


REGULAR ARTICLE
Combining partially ranked data in plant breeding and biology: I. Rank aggregating methods

Ivan Simko, Dov A. Pechenick


Communications in Biometry and Crop Science (2010) 5 (1), 41-55.
 

ABSTRACT
Combining heterogeneous data from plant breeding trials into a single dataset can be challenging, especially if observations have been performed only on partially overlapping sets of accessions, or if evaluations were done with different rating scales. In the present work we propose combining such data by making use of aggregate ranking approaches. To test 13 aggregate ranking methods for performance, we have simulated 16 types of datasets that resemble those observed in plant breeding trials. The evaluation of aggregate ranking methods was carried out using both distance-based measures (Kendall’s tau and Spearman’s rho) and number of rank violations caused by a proposed aggregate ranking. Our analysis indicates that methods based on Bradley-Terry or Rasch models performed better than the other tested methods when factors such as fitness of aggregate rankings, time required for analyses, and ability to analyze weak rankings were considered. Verification of the approach on real data from 19 studies indicated a substantial increase in significance (P-value dropped by a factor of 100,000) when linkage between a marker and a trait was based on aggregated data rather than on each of the individual trials. The ability to combine heterogeneous data from independent studies has important ramifications for data analysis in association studies. Results from our study indicate that this kind of meta-analysis is more powerful than individual analyses.
 

Key Words: Adjusted means; Bradley-Terry model; Markov chains; partially ranked data; Plackett-Luce model; rank-aggregation; Rasch model; sampling methods.
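
As an illustration of the kind of aggregation the abstract describes, the following minimal Python sketch combines partially overlapping rankings with a Bradley-Terry model and counts the rank violations of the resulting aggregate ranking. It is not the authors' implementation. The function names, the fitting scheme (a standard minorization-maximization update), and the small `prior` pseudo-count against a hypothetical reference item of strength 1 are all assumptions, chosen only to keep the estimates finite when an accession wins or loses every comparison.

```python
from itertools import combinations


def pairwise_wins(rankings):
    """Count how often one accession is ranked ahead of another.

    Each ranking lists accessions from best to worst; trials may cover
    different, partially overlapping sets of accessions.
    """
    wins = {}
    for ranking in rankings:
        for better, worse in combinations(ranking, 2):
            wins[(better, worse)] = wins.get((better, worse), 0) + 1
    return wins


def bradley_terry(rankings, iterations=200, prior=0.1):
    """Estimate Bradley-Terry strengths with minorization-maximization updates.

    Assumption (not from the article): every accession gets `prior` wins and
    `prior` losses against a fixed reference item of strength 1.0.
    """
    wins = pairwise_wins(rankings)
    items = sorted({item for ranking in rankings for item in ranking})
    strength = {item: 1.0 for item in items}
    for _ in range(iterations):
        updated = {}
        for i in items:
            total_wins = prior
            denominator = 2.0 * prior / (strength[i] + 1.0)
            for j in items:
                if j == i:
                    continue
                w_ij = wins.get((i, j), 0)
                n_ij = w_ij + wins.get((j, i), 0)  # comparisons between i and j
                total_wins += w_ij
                if n_ij:
                    denominator += n_ij / (strength[i] + strength[j])
            updated[i] = total_wins / denominator
        strength = updated
    return strength


def aggregate_ranking(rankings):
    """Order accessions from strongest to weakest estimated strength."""
    strength = bradley_terry(rankings)
    return sorted(strength, key=strength.get, reverse=True)


def rank_violations(aggregate, rankings):
    """Count input pairs whose order is reversed in the aggregate ranking."""
    position = {item: index for index, item in enumerate(aggregate)}
    return sum(
        1
        for ranking in rankings
        for better, worse in combinations(ranking, 2)
        if position[better] > position[worse]
    )


if __name__ == "__main__":
    # Three hypothetical trials, each scoring a different subset of accessions.
    trials = [["A", "B", "C"], ["B", "C", "D"], ["A", "C", "D"]]
    combined = aggregate_ranking(trials)
    print(combined)                           # ['A', 'B', 'C', 'D']
    print(rank_violations(combined, trials))  # 0
```

Because the model only needs pairwise outcomes, trials that score different subsets of accessions, or use different rating scales, can contribute to the same fit once each trial is reduced to a ranking.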