GSoC 2020 Ideas
For project maintainers
This is a list of ideas for students who are considering applying to Google Summer of Code 2020 for Haskell.org. You can contribute ideas by sending a pull request to our GitHub repository. If you just want to discuss a possible idea, please contact us.
For students
Please be aware that:
- This is not an all-inclusive list, so you can apply for projects not in this list and we will try our best to match you with a mentor.
- You can apply for as many ideas as you want (but only one can be accepted).
- Some general tips on writing a proposal are discussed here.
Table of Contents
- Documentation generator for the Dhall configuration language
- Faster factorization algorithms
- Build-integration and Badges for Hackage
- Finish the package candidate workflow for Hackage
- Lua interface to the http-client library
- Property-based testing stateful programs using QuickCheck
Documentation generator for the Dhall configuration language
The Dhall configuration language is a programmable configuration language designed to balance ease of maintenance with general-purpose programming language features. The Dhall language has multiple independent implementations, each of which binds to a different host programming language, similar to how JSON or YAML can be read into multiple programming languages. However, a large number of supporting tools are built on top of the Haskell implementation, mainly because that was the first Dhall implementation.
One supporting tool of interest is a documentation generator. Up until now, Dhall packages have been mostly hosted on GitHub/GitLab and documentation consists of inline comments within source files, such as this one:
Many users have requested a more polished solution for generating documentation from these commented source files, analogous to Haskell’s haddock tool (a documentation generator for Haskell), which they can then host (as HTML) or include within their Dhall projects (as Markdown files checked into version control). There have even been some nascent attempts to implement this, such as:
To that end, the goal of this project is to implement a command-line documentation generator whose input is a directory tree containing a Dhall package and whose output is documentation in either Markdown or HTML form. The scope of this project does not include hosting documentation on behalf of users. In other words, this project will only build a Dhall analog of haddock and will not attempt to build a Dhall analog of Hackage.
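As a rough illustration of the core transformation, the sketch below turns the leading comment of one Dhall source file into a Markdown section. It assumes that documentation is written as a run of `--` line comments at the top of the file, and it treats the source as plain text rather than using the Dhall parser. The names `headerComment` and `renderMarkdown` and the file name `example.dhall` are hypothetical and not part of any existing tool.

```haskell
import Data.Char (isSpace)

-- | The leading run of `--` line comments in a source file, one entry per
-- comment line, with the comment marker and following spaces removed.
-- Assumption: documentation lives in `--` comments at the top of the file.
headerComment :: String -> [String]
headerComment = map clean . takeWhile isComment . map (dropWhile isSpace) . lines
  where
    isComment l = take 2 l == "--"
    clean l = dropWhile (== ' ') (drop 2 l)

-- | Render one source file as a Markdown section: the file path as a
-- heading, followed by the header comment (or a placeholder if none).
renderMarkdown :: FilePath -> String -> String
renderMarkdown path src = unlines (("## " ++ path) : "" : body)
  where
    body = case headerComment src of
      [] -> ["*No documentation.*"]
      ls -> ls
```

A real tool would presumably walk the package's directory tree, apply a function like this to each file, and also handle `{- ... -}` block comments.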
This project should be appropriate for a beginning Haskell programmer with web development experience. The amount of Haskell code required to write the first draft of the project should be small and there will be many opportunities within the project to exercise web development skills to improve the visual appeal, user experience, and ease of comprehension of the generated documentation.
The project scope can also be extended depending on how things progress by adding features common to other documentation generators, such as:
- Rendering tests (which are natively supported by the language)
- Browsing the original source code
- Type on hover (within the rendered source code)
- Jump to definition (within the rendered source code)
Potential Mentors: Gabriel Gonzalez
Difficulty: Beginner
Faster factorization algorithms
There is a growing (and coming-of-age) ecosystem of Haskell packages for cryptography, as witnessed by the increasing number of blockchain and zero-knowledge protocols. This project aims to fill one of the remaining gaps: state-of-the-art algorithms for integer factorization.
The most advanced existing Haskell implementations still use elliptic curve factorization (example). But there is a modern family of vastly faster factorization methods: number field sieves. The goal is to implement them as a separate Haskell library or as part of the arithmoi package.
To reach the goal the candidate could implement the quadratic sieve, achieving decent performance characteristics. If there is still time left, we will proceed with the general number field sieve.
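Sieve methods build on the observation that integers a and b with a^2 ≡ b^2 (mod n) often yield a factor of n via gcd(a − b, n). The sketch below shows the simplest instance of that idea, Fermat's method, which searches for an a such that a^2 − n is an exact square. It is not the quadratic sieve, only an illustration of the difference-of-squares principle. The names `isqrt` and `fermat` are chosen here for illustration and are not taken from arithmoi.

```haskell
-- | Floor of the square root of a non-negative integer, by Newton iteration.
isqrt :: Integer -> Integer
isqrt 0 = 0
isqrt n = go n
  where
    go x =
      let y = (x + n `div` x) `div` 2
      in if y >= x then x else go y

-- | Fermat's method for an odd positive n: find the smallest a >= sqrt n
-- such that a^2 - n is a perfect square b^2, so that n = (a - b) * (a + b).
-- For a prime n this ends with the trivial factorization (1, n).
-- Precondition: n is odd and positive (the search may not terminate otherwise).
fermat :: Integer -> (Integer, Integer)
fermat n = go start
  where
    r = isqrt n
    start = if r * r == n then r else r + 1
    go a
      | b * b == b2 = (a - b, a + b)
      | otherwise = go (a + 1)
      where
        b2 = a * a - n
        b = isqrt b2
```

For example, `fermat 5959` evaluates to `(59, 101)`, since 80^2 − 5959 = 441 = 21^2. The method is only fast when the two factors are close together, which is the limitation that sieve methods remove.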
The candidate should have a basic knowledge of linear algebra and number theory and be willing to learn more. This project may be a good fit for students with a strong mathematical background, but little practice in Haskell, because it is self-contained and involves neither scary types nor arcane interfaces.
Mentors: Andrew Lelechenko, Sergey Vinokurov.
Difficulty: Intermediate.
Build-integration and Badges for Hackage
The Hackage docbuilder currently only gives a pass/fail with generated documentation or failure logs. Ideally we should be able to infer and present a lot more interesting data about packages to encourage package maintainers. The existence of test suites and the extent of their coverage, the success of builds with different versions, the existence of benchmark suites, and even the extent of documentation can all be recognized with badges or shields.
This work involves extending the existing docbuilder to run more detailed builds and report more detailed data, as well as extending the Hackage UI to better display data both within cabal metadata and also as generated by the builder.
Additionally, it would be good to rearchitect the builder so that it doesn’t store its “unbuildable” set locally, but instead is locally stateless and driven by polling the Hackage server for instructions. This allows better scale-out and parallelization of builders, as well as distribution of work.
Potential Mentors: Gershom Bazerman, Herbert Valerio Riedel
Difficulty: Intermediate
Finish the package candidate workflow for Hackage
Hackage candidate packages currently cannot be used directly, and their UI could be improved. We would like new packages to be uploaded as candidates by default, to improve the vetting process. But this means polishing off candidate functionality. The main issues left to do are tracked here.
The first step is moving the candidate display page to the new templating system and sharing code with the main package page. Following this, we need to implement a new candidate index, able to be provided as a secondary index. This would be a “v1” index, and mutable.
Beyond this we want to extend the docbuilder and docuploads to work with candidates, and then implement a fixed workflow from candidacy to validation and then publishing.
Mentors: Gershom Bazerman, Herbert Valerio Riedel
Difficulty: Intermediate
Lua interface to the http-client library
The HsLua library makes it possible to embed an interpreter for the Lua programming language into programs written in Haskell. One example is pandoc, which uses Lua as an extension language, allowing users to author custom writers or to modify pandoc’s internal document representation.
HsLua can expose Haskell functions to Lua scripts, thereby enabling users to access functionality otherwise hidden in a program’s internals. Lua bindings to http-client, a popular, easy-to-use, and powerful Haskell HTTP library, are currently lacking. Such bindings could give pandoc users great additional power without the need for external C Lua libraries.
The candidate, who should be familiar with Haskell (Lua knowledge is an optional bonus), could:
- choose the Haskell functions that would be most useful to Lua users;
- write bindings for these functions, as well as tests for those bindings;
- publish the bindings as a library on Hackage and Stackage.
If time permits, the new library could be included in pandoc.
Mentors: Albert Krewinkel
Difficulty: Intermediate to advanced.
Property-based testing stateful programs using QuickCheck
When the first version of QuickCheck was released for Haskell, it was the state of the art in testing. Today, however, it lags behind, for example, Erlang’s PropEr and eqc libraries. The quickcheck-state-machine library is an attempt to add state machine modelling to Haskell’s QuickCheck for testing stateful/monadic code, and thereby catch up with the Erlang versions of QuickCheck.
This proposal is about using, and possibly extending, quickcheck-state-machine in order to improve the quality of Haskell code in general and for a specific project in particular.
The intermediate candidate could:
- Find a commonly used and stateful Haskell library or application to test. This can also be a toy library or application from a commonly used Haskell resource (e.g. a tutorial, book or blog post);
- Write a state machine model, for said library or application, together with at least a sequential property, and possibly a parallel property as well;
Getting this far would already reach the goal, but if there’s enough time the candidate could in addition to the above also try to do one of the following items:
- Add fault injection to the model, and thereby test the robustness of the code;
- Turn the state machine model into a mock, as described here, and implement and test a library or application that depends on the original library or application using the mock.
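The state machine model and sequential property mentioned above can be pictured with a small sketch in plain Haskell. It does not use the quickcheck-state-machine API: the counter, its commands, and all function names are hypothetical, and short command sequences are enumerated exhaustively in place of QuickCheck's random generation and shrinking.

```haskell
import Control.Monad (replicateM)
import Data.IORef (IORef, modifyIORef', newIORef, readIORef, writeIORef)
import Data.List (mapAccumL)

-- Commands of a hypothetical stateful counter API.
data Cmd = Incr | Reset | Get
  deriving (Eq, Show)

-- Pure model: the state is an Int, and each command produces the next
-- state together with the response the real system is expected to give.
step :: Int -> Cmd -> (Int, Maybe Int)
step s Incr  = (s + 1, Nothing)
step _ Reset = (0, Nothing)
step s Get   = (s, Just s)

-- System under test: the same API backed by a mutable reference.
exec :: IORef Int -> Cmd -> IO (Maybe Int)
exec ref Incr  = modifyIORef' ref (+ 1) >> pure Nothing
exec ref Reset = writeIORef ref 0 >> pure Nothing
exec ref Get   = Just <$> readIORef ref

-- Sequential property: running the commands against a fresh system
-- yields exactly the responses predicted by the model.
agrees :: [Cmd] -> IO Bool
agrees cmds = do
  ref <- newIORef 0
  actual <- mapM (exec ref) cmds
  let expected = snd (mapAccumL step 0 cmds)
  pure (actual == expected)

-- Check the property for every command sequence of length at most n.
checkUpTo :: Int -> IO Bool
checkUpTo n = and <$> mapM agrees (concatMap sequencesOf [0 .. n])
  where
    sequencesOf k = replicateM k [Incr, Reset, Get]
```

Here `checkUpTo 4` runs every sequence of at most four commands and returns `True` when the implementation matches the model on all of them. With QuickCheck, the enumeration would be replaced by a generator of command sequences, and a failing sequence would be shrunk to a minimal counterexample.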
The advanced candidate could additionally try to do one of the following items:
Mentors: Stevan Andjelkovic
Difficulty: Intermediate to advanced
Summer of Haskell