The peer review process is commonly used to determine how research funding is allocated, and reviewer reliability, or consistency, is integral to this process. However, factors other than the review criteria, such as reviewer and applicant gender, could influence scoring, creating the potential for bias. An experimental study of gender, application quality and reviewer reliability was undertaken through a collaboration between Washington State University and the American Institute of Biological Sciences, using evaluations of mock grant applications. While moderate reliability (including test-retest reliability) was observed overall, reliability decreased for applications containing weaknesses. In addition, women reviewers were generally more positive in their evaluations than men reviewers. Finally, women investigators received better overall evaluations than men investigators, with no interaction between reviewer and investigator gender. In light of these results, the usefulness of reviewer calibration and training should be explored. This research was supported by the National Science Foundation (1951132 and 1951251).