HumanELY: Human Evaluation of Large Language Model Yield

To provide a structured way to perform human evaluation, we propose HumanELY, comprising the first and most comprehensive guidance and an accompanying web application. Our approach and tools, derived from commonly used evaluation metrics, help evaluate large language model outputs in a comprehensive, consistent, measurable, and comparable manner.