**SPECIALIZABLE GHC pragma**

Student: Francesco Gazzetta @fgaz

Mentors: Carter Schonwald, Andreas Klebinger, chessai

Student report

**Add primops to expand the (boxed) array API**

Student: buggymcbugfix

Mentors: andrewthad, Andreas Klebinger, chessai

Student report

**Build-integration and Badges for Hackage**

Student: Shubham Awasthi

Mentors: hvr, Gershom Bazerman

Student report

**Building the Haskell Language Server and more**

Student: Luke Lau

Mentors: Alan Zimmerman, Pepe Iborra, Zubin Duggal

Student report

**Custom Dataloader for Hasktorch**

Student: Andre Daprato

Mentors: Austin Huang, Adam Paszke, Torsten Scholak, Junji Hashimoto

**Documentation generator for the Dhall configuration language**

Student: German Robayo

Mentors: Profpatsch, Gabriel Gonzalez, sjakobi

Student report

**Finish the package candidate workflow for Hackage**

Student: Sitao Chen

Mentors: hvr, Gershom Bazerman

Student report

This summer, I have participated in Google Summer of Code with Haskell org and worked on Hackage candidate UI and workflow. Without previous experience in open source development, I was able to grasp a large codebase and its structure in a short period with the help of my mentors. Besides, I got a chance to learn about how to make API calls and how to improve UI using Haskell in a formal setting. This experience helps me have a better understanding of packages workflow management and web services in Haskell. I wish I can contribute again in the future!

**Functional Machine Learning Algorithms for Music Generation**

Student: Elizabeth Wilson

Mentors: Alex McLean, Austin Huang, Torsten Scholak

Student report

**Multiple Home Packages for GHC**

Student: fendor

Mentors: Zubin Duggal, John Ericson, Matthew Pickering

Student report

Haskell IDE Engine was the first open source project I ever contributed to, and over time, it became a project of passion for me. Over the months I dove deeper into Haskell tooling, until I got the chance to work on GHC itself in this year’s Google Summer of Code! I worked on this project to improve the tooling situation for Haskell, as well as improving the IDE experience by implementing features needed by both.

The project itself proved to be challenging, mainly because of my unfamiliarity with the GHC code base. However, with the help of my helpful mentors, I was able to overcome the challenges and learned a lot about GHC. I am glad I had the chance to work on this project, although I did not accomplish everything I wanted to, yet.

**Number Field Sieves**

Student: Federico Bongiorno

Mentors: Sergey Vinokurov, Andrew Lelechenko

Student report

**Optimising Haskell developer tool performance using OpenTelemetry**

Student: Michalis Pardalos

Mentors: Dmitry Ivanov, Matthew Pickering

My project was about adding support for OpenTelemetry tracing into ghcide, the core component of haskell-language-server. I had very little experience with open-source development, or the internals of Haskell and GHC, before this project, and I can say for sure that this has changed. Aside from working on ghcide itself, I also had to submit patches to haskell-opentelemetry, implementing features necessary for this project. When the project was blocked by a GHC bug, I also took this as an opportunity to dive into GHC and fix it myself, which I found incredibly rewarding and consider a valuable experience.

Even though I ended up running out of time and not finishing everything I hoped for in the project, I can say for sure that it was a positive experience which I would absolutely recommend.

**Update stylish-haskell to use ghc-lib-parser**

Student: Beatrice Vergani

Mentors: Jasper Van der Jeugt, lukaszgolebiewski, Paweł Szulc

Student report

Google will be hosting GSoC again in 2021, and of course we plan to apply again. If you have ideas for projects that students could work on, we’ll be using the same format as the years before – this page has more information on how to submit an idea.

Thanks a lot to everyone involved!

Haskell.org has been able to take part in this program in the past two years, and we’d like to keep this momentum up since it greatly benefits the community.

Google is not extremely open about what factors it considers for applications from organizations, but they have stated multiple times that a well-organized ideas list is crucial. For that, we would like to count on all of you again.

If you are the maintainer or a user of a Haskell project, and you have an improvement in mind which a student could work on during the summer, please submit an idea here:

https://summer.haskell.org/ideas.html

For context, Google Summer of Code is a program where Google sponsors students to work on open-source projects during the summer. Haskell.org has taken part in this program in 2006-2015, and 2018-2019. Many important improvements to the ecosystem have been the direct or indirect result of Google Summer of Code projects, and it has also connected new people with the existing community.

Projects should benefit as many people as possible – e.g. an improvement to GHC will benefit more people than an update to a specific library or tool, but both are definitely valid. New libraries and applications written in Haskell, rather than improvements to existing ones, are also accepted. Projects should be concrete and small enough in scope such that they can be finished by a student in three months. Past experience has shown that keeping projects “small” is almost always a good idea.

Unfortunately, this summary is less successful – I meant to contact the students immediately after the summer, but that mail never went through and I failed to follow up on it – my apologies.

In either case, I still wanted to list the successful projects here for posterity. I reached out to the students again and will be updating this post with more information and quotes as they get back to me.

**A language server for Dhall**

Student: Frederik Ramcke

Mentors: Luke Lau, Gabriel Gonzalez

**A stronger foundation for interactive Haskell tooling**

Student: dxld

Mentors: Alan Zimmerman, Matthew Pickering

**Automated requirements checking as a GHC plugin**

Student: Daniel Marshall

Mentors: Chris Smith, chessai, Alphalambda

**Extending Alga**

Student: O V Adithya Kumar

Mentors: Andrey Mokhov, Jasper Van der Jeugt, Alexandre Moine

**Extending Hasktorch With RNNs and Encoder-Decoder**

Student: AdLucem

Mentors: Austin Huang, Junji Hashimoto, Sam Stites

**Functional Machine Learning with Hasktorch: Produce Functional Machine Learning Model Reference Implementations**

Student: Jesse Sigal

Mentors: Austin Huang, idontgetoutmuch, Junji Hashimoto, Sam Stites

**Hadrian Optimisation**

Student: ratherforky

Mentors: Andrey Mokhov, Neil Mitchell

**Implementing Chebyshev polynomial approximations in Haskell: Having the speed and precision of numerics with complex, non-polynomial functions.**

Student: Deifilia To

Mentors: tmcdonell, idontgetoutmuch, Albert Krewinkel

**Improving Hackage Matrix Builder as a Real-world Fullstack Haskell Project**

Student: Andika Riyandi (Rizary)

Mentors: Herbert Valerio Riedel, Robert Klotzner

**Improving HsYAML Library**

Student: Vijay Tadikamalla

Mentors: Herbert Valerio Riedel, Michał Gajda

**Issue-Wanted Web Application**

Student: Rashad Gover

Mentors: Veronika Romashkina, Dmitrii Kovanikov

**More graph algorithms for Alga**

Student: Vasily Alferov

Mentors: Andrey Mokhov, Alexandre Moine

**Property-based testing stateful programs using quickcheck-state-machine**

Student: Kostas Dermentzis

Mentors: stevana, Robert Danitz

**Putting hie Files to Good Use**

Student: Zubin Duggal

Mentors: Alan Zimmerman, Matthew Pickering

**Upgrading hs-web3 library**

Student: amany9000

Mentors: Alexander Krupenkin, Thomas Dietert

Thanks to everyone involved!

When you apply to Summer of Code, you write a proposal. The proposal is a document in which you describe your ideas on the chosen project. It should be a clear, detailed text with suggestions on every subtask. The proposal should also include a timeline, in which you estimate the time you intend to spend on each of those subtasks.

I chose this project for my summer. In my proposal, I drafted all the algorithms mentioned in the list and suggested a few more. I published this part of my proposal as a GitHub gist.

I don’t suggest this gist as a complete example of a good proposal: it’s only a part of the document I submitted. You should also include some information about you, together with the timeline. Communication with your future mentors is also a significant part of the application.

However, as I mentioned in one of my previous posts, another student ended up doing the part suggested in the ideas list. So my task is to introduce bipartite graphs.

This task was my idea. I mentioned it in my proposal. I meant that finding maximum matchings in bipartite graphs should be easily implemented when we have algorithms for finding maximum flows in networks. Kuhn’s algorithm is an application of the Ford-Fulkerson algorithm, and the Hopcroft-Karp algorithm is an application of Dinic’s algorithm.

However, this option is not the best. Both algorithms have specialized implementations that run many times faster. So my task for this summer was to introduce bipartite graphs and special functions for working with them.

I made four pull requests to Alga this summer. Each pull request represents a separate task and summarizes the work of several weeks.

Each PR contains the actual implementation, tests, and documentation. The whole project is release-ready after merging each one of them. I put the tests in the `test/` directory. The documentation for each function and datatype precedes the declaration. After release, it will compile to a beautiful Haddock page like this.

**Link to PR:** https://github.com/snowleopard/alga/pull/207

In this part, I defined the `Bipartite.AdjacencyMap` datatype and added many functions to work with adjacency maps.

The datatype represents a map of vertices into their neighbours. I defined it as two maps:

```
data AdjacencyMap a b = BAM {
    leftAdjacencyMap  :: Map.Map a (Set.Set b),
    rightAdjacencyMap :: Map.Map b (Set.Set a)
}
```

The properties are based on the existing properties of graphs in Alga.
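Storing every edge twice means the two maps have to stay in sync. As a minimal sketch of that invariant (the helper names `edgesFromLeft`, `edgesFromRight` and `consistent` are hypothetical and chosen for illustration; they are not taken from the PR), one could check that both maps describe the same set of edges:

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

-- Same shape as the datatype above: every edge is stored twice,
-- once from the left side and once from the right side.
data AdjacencyMap a b = BAM
    { leftAdjacencyMap  :: Map.Map a (Set.Set b)
    , rightAdjacencyMap :: Map.Map b (Set.Set a)
    }

-- Hypothetical helper: all edges as seen from the left part.
edgesFromLeft :: AdjacencyMap a b -> [(a, b)]
edgesFromLeft g =
    [ (u, v) | (u, vs) <- Map.toList (leftAdjacencyMap g), v <- Set.toList vs ]

-- Hypothetical helper: all edges as seen from the right part.
edgesFromRight :: AdjacencyMap a b -> [(a, b)]
edgesFromRight g =
    [ (u, v) | (v, us) <- Map.toList (rightAdjacencyMap g), u <- Set.toList us ]

-- The two maps agree on the edges iff both edge sets coincide.
consistent :: (Ord a, Ord b) => AdjacencyMap a b -> Bool
consistent g =
    Set.fromList (edgesFromLeft g) == Set.fromList (edgesFromRight g)
```

A check of this kind is the sort of property that fits naturally into a QuickCheck test suite.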

**Link to PR:** https://github.com/snowleopard/alga/pull/218

There is a folklore algorithm that checks if a given graph is bipartite. The task to implement this algorithm in Haskell was a little challenging for me.

I finished up with the following definition of the function:

`detectParts :: Ord a => AM.AdjacencyMap a -> Either (OddCycle a) (AdjacencyMap a a)`

It is known that a graph is bipartite if and only if it contains no cycles of odd length. This function either finds an odd cycle or returns a partition.

The implementation is so exciting that I wrote a whole post about it. There I explained why I needed monad transformers and ran some interesting benchmarks that led me to use an explicit `INLINE` pragma.

**Link to the unfinished PR**: https://github.com/snowleopard/alga/pull/226

Some families of graphs are bipartite: simple paths, even cycles, trees, bicliques, etc. The task is to provide a simple method to construct all those graphs.

The most exciting part of this task was to provide type-safe implementations. For example, only cycles of even length are bipartite. And speaking of paths, we should provide a method for constructing paths of vertices of two different types.

The `circuit` definition for constructing graphs containing one even cycle is simple:

`circuit :: (Ord a, Ord b) => [(a, b)] -> AdjacencyMap a b`

For the paths, I added a special type for alternating lists:

`data List a b = Nil | Cons a (List b a)`

So the `path` definition is:

`path :: (Ord a, Ord b) => List a b -> AdjacencyMap a b`
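To see how the alternating list forces vertices of the two types to interleave, here is a small sketch. The `List` type is the one from the text; `toEithers` and `pathEdges` are hypothetical helpers written for this illustration and are not claimed to be part of the PR.

```haskell
-- The alternating list type from the text: the element types swap at each step.
data List a b = Nil | Cons a (List b a)

-- Hypothetical helper: all elements in order, each tagged with its side.
toEithers :: List a b -> [Either a b]
toEithers Nil         = []
toEithers (Cons x xs) = Left x : map swap (toEithers xs)
  where
    -- The tail has its type parameters flipped, so flip the tags back.
    swap (Left y)  = Right y
    swap (Right y) = Left y

-- Hypothetical helper: the edges of the path, i.e. all consecutive pairs,
-- always written as (left vertex, right vertex).
pathEdges :: List a b -> [(a, b)]
pathEdges xs = go (toEithers xs)
  where
    go (Left x  : rest@(Right y : _)) = (x, y) : go rest
    go (Right y : rest@(Left x  : _)) = (x, y) : go rest
    go _                              = []

-- A path 1 - 'a' - 2 - 'b' with Int vertices on the left, Char on the right.
example :: List Int Char
example = Cons 1 (Cons 'a' (Cons 2 (Cons 'b' Nil)))
```

Here `pathEdges example` evaluates to `[(1,'a'),(2,'a'),(2,'b')]`. A list such as `Cons 1 (Cons 2 Nil)` does not type-check at `List Int Char`, which is exactly the guarantee the type is meant to provide.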

For now, the PR is almost merge-ready; only a few small review comments still need to be addressed.

**Link to the unfinished PR**: https://github.com/snowleopard/alga/pull/229

This algorithm is the fastest one for maximum matchings in bipartite graphs. The implementation is rather straightforward.

However, there is an aspect of this PR I’d like to share here.

I implemented the following function:

```
augmentingPath :: (Ord a, Ord b) => Matching a b
                                 -> AdjacencyMap a b
                                 -> Either (VertexCover a b) (List a b)
```

Given a matching in a graph, it returns either an augmenting path for the matching or a vertex cover of the same size, thus proving that the given matching is maximum. As both outcomes can be easily verified, this helps to write perfect tests that ensure that the matching returned by my function is maximum indeed.
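The "easily verified" part can be sketched as follows. This is only an illustration under simplifying assumptions: edges and matchings are plain lists of pairs and a vertex cover is a pair of sets, which is not necessarily how `Matching` and `VertexCover` are represented in the PR. All names below are hypothetical.

```haskell
import qualified Data.Set as Set

-- Hypothetical simplified representations, for illustration only.
type Edge a b        = (a, b)
type VertexCover a b = (Set.Set a, Set.Set b)

-- A vertex cover must touch every edge of the graph.
isVertexCover :: (Ord a, Ord b) => [Edge a b] -> VertexCover a b -> Bool
isVertexCover es (ls, rs) =
    all (\(u, v) -> u `Set.member` ls || v `Set.member` rs) es

-- A matching uses every vertex at most once.
isMatching :: (Ord a, Ord b) => [Edge a b] -> Bool
isMatching m =
    Set.size (Set.fromList (map fst m)) == length m &&
    Set.size (Set.fromList (map snd m)) == length m

-- If a valid cover has exactly as many vertices as the matching has edges,
-- the matching is maximum: no matching can be larger than any vertex cover.
provesMaximum :: (Ord a, Ord b)
              => [Edge a b] -> [Edge a b] -> VertexCover a b -> Bool
provesMaximum es m c@(ls, rs) =
    isMatching m && all (`elem` es) m && isVertexCover es c &&
    Set.size ls + Set.size rs == length m
```

The other outcome is just as cheap to check: an augmenting path can be validated by walking it and confirming that its edges alternate between non-matching and matching ones.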

This PR still needs some work. The reason is that two different implementations behave weirdly on the benchmarks.

I wrote a lot of Haskell this summer. This gave me a lot of experience in this language. Although there’s still work to be done, I’m satisfied with the results I got.

I adore the way functional programs are developed. I was surprised to learn how the popular testing (QuickCheck) and benchmarking (Criterion) frameworks are organized. And the precision of the documentation makes the work a lot easier.

A graph is called bipartite if its vertices can be split into two parts in such a way that there are no edges inside either part. While testing a graph for tripartiteness is NP-hard, there is a linear-time algorithm that tests a graph for bipartiteness and recovers the partition.

The algorithm is usually one of the first graph algorithms given in any university course. The idea is rather straightforward: we try to assign vertices to the left or right part in some way, and when we get a conflict, we claim that the graph is not bipartite.

First, we assign some vertex to the left part. Then, we can confidently say that all neighbours of this vertex should be assigned to the right part. Then, all neighbours of those vertices should be assigned to the left part, and so on. We continue this until all the vertices in the connected component are assigned to some part, then we repeat the same action on the next connected component, and so on.

If there is an edge between vertices in the same part, one can easily find an odd cycle in the graph, hence the graph is not bipartite. Otherwise, we have the partition, hence the graph is bipartite.

There are two common ways of implementing this algorithm in linear time: using Depth-First Search or Breadth-First Search. We usually select DFS for this algorithm in imperative languages. The reason is that the DFS implementation is a little simpler. I selected DFS too, as the traditional choice.

So we arrive at the following scheme. We go through the vertices in DFS order and assign them to parts, flipping the part when going through an edge. If we try to assign some vertex to some part and see that it is already assigned to the other part, then we claim that the graph is not bipartite. When all vertices are assigned to parts and we’ve looked through all edges, we have the partition.

In Haskell, all computations are supposed to be *pure*. Still, if it was *really* so, we wouldn’t be able to print anything to the console. And what I find most funny about pure computations is that they are so lazy that there is no pure reason to compute anything.

Monads are the Haskell way to express computations with *effects*. I’m not going to give a complete explanation of how they work here, but I find this one very nice and clear.

What I **do** want to point out here is that while some monads, like `IO`, are implemented through some deep magic, others have simple and pure implementations. So the entire computation in these monads is pure.

There are many monads that express all kinds of effects. It is a very beautiful and powerful theory: they all implement the same interface. We will talk about the three following monads:

- `Either e a`: a computation that returns a value of type `a` or throws an error of type `e`. The behaviour is very much like exceptions in imperative languages, and the errors may be caught. The main difference is that this monad is implemented entirely as ordinary code in the standard library, while in imperative languages exceptions are usually implemented by the operating system or virtual machine.
- `State s a`: a computation that returns a value of type `a` and has access to a modifiable state of type `s`.
- `Maybe a`: the `Monad` instance for `Maybe` expresses a computation that can be interrupted at any moment by returning `Nothing`. But we will mostly speak of the `MonadPlus` instance, which expresses the opposite effect: a computation that can be interrupted at any moment by returning a concrete value.
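The `MonadPlus` reading of `Maybe` is perhaps easiest to see with `msum` and `guard` from `Control.Monad`. The following is a minimal sketch; `firstHit` and `firstEven` are names made up for this example.

```haskell
import Control.Monad (guard, msum)

-- msum tries the alternatives from left to right and returns the first
-- one that produced a value.
firstHit :: Maybe Int
firstHit = msum [Nothing, Just 1, Just 2]   -- Just 1

-- guard abandons the current alternative when its condition is False,
-- so msum moves on to the next element of the list.
firstEven :: [Int] -> Maybe Int
firstEven xs = msum [ guard (even x) >> Just x | x <- xs ]
```

For instance, `firstEven [1, 3, 4, 6]` is `Just 4`, and `firstEven [1, 3]` is `Nothing`: the search is interrupted as soon as a concrete value is found.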

We have two data types, `Graph a` and `Bigraph a b`, the first representing graphs with vertex labels of type `a` and the second representing bipartite graphs with left part labels of type `a` and right part labels of type `b`.

**A Word of Warning**: These are not Alga data types. Alga representation for bipartite graphs is not yet released and there is no representation for undirected graphs.

We also assume that we have the following functions.

```
-- List of neighbours of a given vertex.
neighbours :: Ord a => a -> Graph a -> [a]

-- Convert a graph with vertices labelled with their parts to a bipartite
-- graph, ignoring the edges within one part.
toBipartiteWith :: (Ord a, Ord b, Ord c) => (a -> Either b c)
                                         -> Graph a
                                         -> Bigraph b c

-- List of vertices.
vertexList :: Ord a => Graph a -> [a]
```

Now we write the definition for the function we are going to implement.

```
type OddCycle a = [a]
detectParts :: Ord a => Graph a -> Either (OddCycle a) (Bigraph a a)
```

It can be easily seen that the odd cycle is at the top of the recursion stack in case we failed to find the partition. So, in order to restore it, we only need to cut everything from the recursion stack before the first occurrence of the last vertex.
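As a small worked example of that trimming step (the stand-alone function name `trimToCycle` is hypothetical and used only for this illustration): suppose the recursion stack, read from the root, is `[1, 2, 3, 4, 2]`, meaning that from vertex 4 we tried to step to vertex 2 and found a conflict.

```haskell
-- Hypothetical stand-alone version of the trimming step: drop everything
-- up to and including the first occurrence of the last vertex.
trimToCycle :: Eq a => [a] -> [a]
trimToCycle path = drop 1 (dropWhile (/= last path) path)

-- The stack [1, 2, 3, 4, 2] contains the odd cycle 2 - 3 - 4 - 2.
exampleCycle :: [Int]
exampleCycle = trimToCycle [1, 2, 3, 4, 2]   -- [3, 4, 2]
```

Vertex 1 is not on the cycle, so it is cut away, and the repeated vertex 2 is kept only once.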

We will implement a Depth-First Search while maintaining a map of part identifiers for each vertex. The recursion stack for the vertex in which we failed to find the partition will be restored automatically with the `Functor` instance for the monad we choose: we only need to put all vertices from the path into the result on our way back from the recursion.

The first idea is to use the `Either` monad, which fits our goals perfectly well. The first implementation I had was something very close to that. In fact, at some point I had five different implementations to choose the best from, and I finally settled on another option.

First, we need to maintain a map of parts: this is a job for `State`. Then, we need to stop when we find a conflict. This could be either the `Monad` instance for `Either` or the `MonadPlus` instance for `Maybe`. The main difference is that `Either` has a value to be returned in case of success, while the `MonadPlus` instance for `Maybe` only returns a value in case we failed to find the partition. As we don’t need a value in case of success, because it’s already stored in `State`, we choose `Maybe`. Now we need to combine two monadic effects, so we need monad transformers, which are a way to do exactly that.
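To get a feel for how the two effects interact, here is a minimal sketch that uses only the `transformers` package. The functions `visit` and `search` are made up for this example and have nothing to do with graphs yet. The point is that a state update made inside a failed alternative is still visible afterwards, which is what a DFS needs for its "visited" bookkeeping.

```haskell
import Control.Monad (guard, msum)
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.Maybe (MaybeT, runMaybeT)
import Control.Monad.Trans.State.Strict (State, modify, runState)

-- Record each visited number in the state; succeed only on numbers > 2.
visit :: Int -> MaybeT (State [Int]) Int
visit x = do
    lift (modify (x :))   -- this state update survives a later failure
    guard (x > 2)         -- failing here makes msum try the next alternative
    return x

-- Stops at the first success; the state keeps every visit made so far.
search :: (Maybe Int, [Int])
search = runState (runMaybeT (msum (map visit [1, 2, 3, 4]))) []
```

Here `search` evaluates to `(Just 3, [3, 2, 1])`: the alternatives for 1 and 2 fail but leave their marks in the state, and 4 is never visited because the search is interrupted at 3.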

Why did I choose such a complicated type? There are two reasons. The first is that the implementation becomes very similar to the one we have in imperative languages. The second is that I needed to manipulate the value returned in case of conflict to restore the odd cycle, and this becomes much simpler in `Maybe`.

So, here we go now.

```
{-# LANGUAGE ExplicitForAll #-}
{-# LANGUAGE ScopedTypeVariables #-}

import Control.Monad (guard, msum)
import Control.Monad.State (State, get, modify, runState)
import Control.Monad.Trans.Maybe (MaybeT (..))
import qualified Data.Map.Strict as Map
import Data.Maybe (fromJust)

-- Graph, Bigraph, neighbours, toBipartiteWith and vertexList are the
-- assumed definitions from above.

data Part = LeftPart | RightPart deriving Eq

otherPart :: Part -> Part
otherPart LeftPart  = RightPart
otherPart RightPart = LeftPart

type PartMap a = Map.Map a Part
type OddCycle a = [a]

toEither :: Ord a => PartMap a -> a -> Either a a
toEither m v = case fromJust (v `Map.lookup` m) of
                    LeftPart  -> Left  v
                    RightPart -> Right v

type PartMonad a = MaybeT (State (PartMap a)) [a]

detectParts :: forall a. Ord a => Graph a -> Either (OddCycle a) (Bigraph a a)
detectParts g = case runState (runMaybeT dfs) Map.empty of
                     (Just c, _)  -> Left  $ oddCycle c
                     (Nothing, m) -> Right $ toBipartiteWith (toEither m) g
    where
        inVertex :: Part -> a -> PartMonad a
        inVertex p v = ((:) v) <$> do modify $ Map.insert v p
                                      let q = otherPart p
                                      msum [ onEdge q u | u <- neighbours v g ]

        {-# INLINE onEdge #-}
        onEdge :: Part -> a -> PartMonad a
        onEdge p v = do m <- get
                        case v `Map.lookup` m of
                             Nothing -> inVertex p v
                             Just q  -> do guard (q /= p)
                                           return [v]

        processVertex :: a -> PartMonad a
        processVertex v = do m <- get
                             guard (v `Map.notMember` m)
                             inVertex LeftPart v

        dfs :: PartMonad a
        dfs = msum [ processVertex v | v <- vertexList g ]

        -- Cut everything up to the first occurrence of the last vertex.
        oddCycle :: [a] -> [a]
        oddCycle c = tail (dropWhile (/= last c) c)
```

I’ll try to explain each of the first four scoped functions: this is the core of the algorithm.

- `inVertex` is the part of DFS that happens when we visit a vertex for the first time. Here, we assign the vertex to its part and launch `onEdge` for every incident edge. This is also the place where we restore the call stack: if a `Just` is returned from some edge, we add `v` to the beginning.
- `onEdge` is the part that happens when we visit any edge. It happens twice for each edge. Here we check whether the vertex on the other side is visited. If not, we visit it. Otherwise, we check whether we found an odd cycle. If we did, we simply return the current vertex as a singleton. The other vertices from the path are added on the way back from the recursion.
- `processVertex` checks whether the vertex is visited and runs DFS on it if not.
- `dfs` runs `processVertex` on all vertices.

That’s it.

When I first wrote the above code, `onEdge` was not explicitly inlined. Then, when I was benchmarking different versions of `detectParts` to select the best, I noticed that on some graphs this version with transformers had a serious overhead over the version with `Either`. I had no idea what was going on, because semantically the two functions were supposed to perform the same operations. And it became even weirder when I ran it on another machine with another version of GHC and didn’t notice any overhead there.

After a weekend of reading GHC Core code, I managed to fix this with one explicit inline. At some point between GHC 8.4.4 and GHC 8.6.5, the optimizer changed in a way that made it stop inlining `onEdge`.

This is just a crazy thing about programming that I didn’t expect to run into with Haskell. Still, it seems that optimizers make mistakes even in our time, and it is our job to give them hints about what should be done. For example, here we knew that the function should be inlined, as it is in the imperative version, and that’s a reason to give GHC a hint.

When this patch is merged, I’m going to start implementing Hopcroft-Karp algorithm. I think the BFS part is going to be rather interesting, so the next blog post will come in a couple of weeks.

The idea of the project was on the ideas list published earlier. Two of us were accepted for this project; the other one is Adithya Kumar, who will be doing the work described on the ideas list. He told me his GSoC blog will probably be here.

My task is to introduce bipartite graphs to Alga and that is what I am going to tell you about now.

There are three common ways to represent graphs in computing:

- Adjacency matrix
- Adjacency lists
- Edge lists.

All three of them have their advantages and disadvantages. The most commonly used is the adjacency lists approach: that is storing a list of neighbors for each vertex. In fact, I can think of only one common algorithm for which this approach is not perfect: it is Kruskal’s algorithm for finding the minimum spanning tree.

However, the problem is that feeding graphs formed this way to algorithms is not always safe. For example, if the algorithm is designed for bidirectional graphs, it may rely on the fact that if some vertex `u` is in the list of neighbors of some other vertex `v`, then `v` is in the list of neighbors of `u`.

A traditional solution for functional programming would be to guarantee the consistency of input data for the algorithm by taking a representation of the graph that would not allow a wrong graph to be passed. That’s what we call *type safety*.

Alga is a library that provides such a safe representation with a beautiful algebraic interpretation. It also has a nice set of algorithms out of the box. You can find the paper on Alga by its author here, I’m just going to provide some basics.

Consider the following definition for the graph data type:

```
data Graph a = Empty
             | Vertex a
             | Overlay (Graph a) (Graph a)
             | Connect (Graph a) (Graph a)
```

The constructors mean the following:

- `Empty` constructs an empty graph.
- `Vertex v` constructs a graph of a single vertex labeled `v`.
- `Overlay g h` constructs a graph with sets of vertices and edges united from graphs `g` and `h`.
- `Connect g h` does the same as `Overlay` and also connects all vertices of `g` to all vertices of `h`.

One can easily construct a `Graph` of linear size from a list of edges of the desired graph. In fact, this approach may even save memory for dense graphs compared to adjacency lists. And this approach is surely *type safe* in the sense described above. Compared to adjacency lists, there is no problem with an edge missing from the list of neighbours of its other endpoint. Another possible problem with adjacency lists that is absent here is that an edge might lead to a vertex with no associated adjacency list.
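A minimal sketch of that construction follows. The `Graph` type is the one defined above; `fromEdges`, `vertexSet` and `edgeSet` are written locally for this illustration and are not meant to reproduce Alga's actual implementations.

```haskell
import qualified Data.Set as Set

data Graph a = Empty
             | Vertex a
             | Overlay (Graph a) (Graph a)
             | Connect (Graph a) (Graph a)

-- One Connect per edge, all overlaid: the expression size is linear
-- in the number of edges.
fromEdges :: [(a, a)] -> Graph a
fromEdges = foldr (\(u, v) g -> Overlay (Connect (Vertex u) (Vertex v)) g) Empty

-- Interpret an expression as a set of vertices.
vertexSet :: Ord a => Graph a -> Set.Set a
vertexSet Empty         = Set.empty
vertexSet (Vertex v)    = Set.singleton v
vertexSet (Overlay g h) = vertexSet g `Set.union` vertexSet h
vertexSet (Connect g h) = vertexSet g `Set.union` vertexSet h

-- Interpret an expression as a set of directed edges.
edgeSet :: Ord a => Graph a -> Set.Set (a, a)
edgeSet Empty         = Set.empty
edgeSet (Vertex _)    = Set.empty
edgeSet (Overlay g h) = edgeSet g `Set.union` edgeSet h
edgeSet (Connect g h) = Set.unions
    [ edgeSet g
    , edgeSet h
    , Set.fromList [ (u, v) | u <- Set.toList (vertexSet g)
                            , v <- Set.toList (vertexSet h) ]
    ]
```

The memory saving for dense graphs comes from `Connect`: an expression such as `Connect (Vertex 1) (Overlay (Vertex 2) (Vertex 3))` has three leaves but already describes the two edges `(1,2)` and `(1,3)`, and connecting two overlays of `n` vertices each describes `n * n` edges with only `2 * n` leaves.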

Why algebraic? Well, if we write down simple laws for these graphs, we will see that the laws for the `Connect` and `Overlay` operations are very similar to those for multiplication and addition in a semiring, respectively.

This was just a brief description of Alga. There are many other parts not covered here. One example is that `Graph` might also be provided as a type class rather than a data type. This approach is much more flexible.

An important part of Alga is providing different type-safe representations for different kinds of graph. For example, one for edge-labeled graphs was introduced last year.

Another option is to add a representation that *restricts* the set of possible graphs. One example from the ideas list is to represent only acyclic directed graphs. This is what Adithya will be doing. And my task for the first evaluation period is to provide bipartite graphs.

We often meet bipartite graphs in the real world: connections between entities of different kinds are common. For example, the graph of clients and the backends they use is bipartite. Another example I can think of comes from content recommendation systems: the graph of users and the films or songs they like is bipartite, too.

There are many ideas on how to do so. For example, in my proposal I suggested an approach that seems to match Alga’s design:

```
data Bigraph a b = Empty
                 | LeftVertex a
                 | RightVertex b
                 | Overlay (Bigraph a b) (Bigraph a b)
                 | Connect (Bigraph a b) (Bigraph a b)
```

Here, `Connect` only connects left vertices to the right. As my mentor Andrey figured, there is an interesting addition to the laws: `(LeftVertex u) * (LeftVertex v) = (LeftVertex u) + (LeftVertex v)`. Of course, the same holds for the right vertices.
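The extra law can be checked against a simple interpretation of `Bigraph` expressions as sets of vertices and edges. The sketch below rests on one assumption not stated in the text: `Connect g h` links the left vertices of either argument to the right vertices of the other. The helper names `lefts`, `rights`, `cross` and `edgeSet` are hypothetical.

```haskell
import qualified Data.Set as Set

data Bigraph a b = Empty
                 | LeftVertex a
                 | RightVertex b
                 | Overlay (Bigraph a b) (Bigraph a b)
                 | Connect (Bigraph a b) (Bigraph a b)

-- Vertices of the left part.
lefts :: Ord a => Bigraph a b -> Set.Set a
lefts (LeftVertex u) = Set.singleton u
lefts (Overlay g h)  = lefts g `Set.union` lefts h
lefts (Connect g h)  = lefts g `Set.union` lefts h
lefts _              = Set.empty

-- Vertices of the right part.
rights :: Ord b => Bigraph a b -> Set.Set b
rights (RightVertex v) = Set.singleton v
rights (Overlay g h)   = rights g `Set.union` rights h
rights (Connect g h)   = rights g `Set.union` rights h
rights _               = Set.empty

-- All pairs of a left vertex and a right vertex.
cross :: (Ord a, Ord b) => Set.Set a -> Set.Set b -> Set.Set (a, b)
cross ls rs = Set.fromList [ (u, v) | u <- Set.toList ls, v <- Set.toList rs ]

-- Assumption: Connect only ever adds left-to-right edges, in both directions
-- between its arguments, so two left vertices never become adjacent.
edgeSet :: (Ord a, Ord b) => Bigraph a b -> Set.Set (a, b)
edgeSet (Overlay g h) = edgeSet g `Set.union` edgeSet h
edgeSet (Connect g h) = Set.unions
    [ edgeSet g, edgeSet h, cross (lefts g) (rights h), cross (lefts h) (rights g) ]
edgeSet _             = Set.empty
```

Under this interpretation, `Connect (LeftVertex u) (LeftVertex v)` and `Overlay (LeftVertex u) (LeftVertex v)` have the same vertices and the same (empty) edge set, which is what the law states.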

By now, we agreed that first, I will focus on implementing adjacency maps for bipartite graphs (hey, didn’t I mention that Alga uses adjacency maps on the inside?). It doesn’t make much sense to make a separate algebraic representation, but I may do it if I find something interesting in it.

Now, the first task is to implement the conversion function, which I’m going to start right now. This implementation will simply ignore the edges between vertices of the same part.

```
fromGraph :: Graph (Either a b) -> Bipartite.AdjacencyMap a b
fromGraph = undefined
```

With this stub, my summer-long dive into Haskell begins!

We would like to thank everyone who submitted ideas – this is a key part of being accepted into GSoC. Now, here’s the near term timeline:

- **Today - March 25**: Potential student participants discuss application ideas with mentors
- **March 25 - April 9**: Students can submit applications
- **May 6**: Accepted student proposals announced

At this point, we’re looking for both students and extra mentors. We would like to assign at least two mentors to each project if possible, so the students get the support they deserve. Additional ideas for projects are still welcome!

There’s a lot of information on our Summer of Haskell page. If there are any students who are not sure where to begin, feel free to reach out to us directly!

Last year, we were fortunate enough to join again, and we think the results greatly benefited the Haskell community. We are hoping to do the same for 2019. As far as we know, a really important part of our application to GSoC is the list of ideas we provide. For that, I would like to count on all of you.

If you are the maintainer or the user of a Haskell project, and you have an improvement in mind which a student could work on during the summer, please submit an idea here:

https://summer.haskell.org/ideas.html

For context, Google Summer of Code is a program where Google sponsors students to work on open-source projects during the summer. Haskell.org has taken part in this program from 2006 until 2015, and again in 2018. Many important improvements to the ecosystem have been the direct or indirect result of Google Summer of Code projects, and it has also connected new people with the existing community.

Projects should benefit as many people as possible – e.g. an improvement to GHC will benefit more people than an update to a specific library or tool, but both are definitely valid. New libraries and applications written in Haskell, rather than improvements to existing ones, are also accepted. Projects should be concrete and small enough in scope such that they can be finished by a student in three months.

Google Summer of Code 2018 is officially over. The Haskell.org organisation had a very productive year with 17 accepted projects out of which 15 were successful. We would like to thank the students and mentors for the great summer, and, of course, Google for its generous support towards the open source community.

Before we get into the summary of this year, we’d like to bring attention to the fact that we will soon start preparing for GSoC 2019. This means we will be looking for:

- *Project ideas*: Even if you are not interested in participating yourself, maybe you have some ideas about what a student could hack on to improve the Haskell ecosystem. If that is the case, please submit a PR against this repo or just shoot us an email.
- *Mentors*: If you are interested in mentoring a student throughout the summer, feel free to contact us. You do not need a specific project idea – anyone with some Haskell experience willing to help others is welcome.
- *Students*: If you are thinking about applying to Haskell.org next year, it’s never too early to look for interesting projects.

Please reach out to us if you are interested in any of the above!

Student: Krystal Maughan

Mentors: Chris Smith, Gabriel Gonzalez

Blog: https://kammitama5.github.io/

This project was successful. Things got off to a slow start, but once Krystal got going, she tackled some projects with a lot of impact and benefits for users. You can find a good overview in her blogpost.

Student: Chitrak Raj Gupta

Mentors: Andrey Mokhov, Moritz Angermann

This project unfortunately did not pass the first evaluation.

Student: alanas

Mentors: Matthew Pickering, Erik de Castro Lopo

This project was successful. It looks like deprecated exports will be arriving in GHC 8.8 thanks to alanas’s efforts this summer. He wrote a blogpost about his experience as well.

Student: Simon Jakobi

Mentors: Herbert Valerio Riedel, Alex Biehl

Blog: https://sjakobi.github.io/

This project was successful. An initial version of the `:doc` command made it into GHC-8.6, and Simon made many improvements to the Haddock internals. You can read more about it in his blogpost.

Student: Abhiroop Sarkar

Mentors: Carter Schonwald, Ben Gamari

This project was successful. Because of the complexity of the compiler work relative to the student’s familiarity with it, the code hasn’t been merged yet and still needs a lot of cleanup and iteration. However, Abhiroop intends to continue working on this project with the Haskell and GHC community for the next few months. You can read Abhiroop’s summary here.

Student: Gagandeep Bhatia

Mentors: Marco Zocca, Andika D. Riyandi

This project was successful. Together with Gagandeep, we made some changes to the goals of this project initially and decided to have him target existing libraries rather than doing a greenfield project. He ended up making a number of good contributions to the Data Haskell ecosystem, and the Frames library in particular. He also wrote a wrap-up which you can read here.

Student: Ningning Xie

Mentors: Richard Eisenberg

This project was successful. Ningning writes:

It was an excellent experience for me to complete GSoC 2018 with Haskell.org. During these three months, I got the chance to dive into the state of the art compiler for the Haskell programming language, GHC, with the help from my mentor and the broader community.

I chose the project because dependent types are one of my major research interests. And indeed I gained a lot from it. Firstly, the project was challenging, and working on such a huge codebase sounded frightening, but I managed to make progress and get lots of fun from it. I have learned a lot during this summer, which includes not only Haskell skills, but also many design principles inside GHC.

More details are available in her in-depth report.

Student: Alexandre Moine

Mentors: Andrey Mokhov, Alois Cochard

Blog: https://blog.nyarlathotep.one/

This project was successful. Alexandre worked on a variety of tasks, including benchmarking, optimisations, testing and even correctness proofs. His blogpost has more details.

Student: Andreas Klebinger

Mentors: José Calderón, Joachim Breitner, Ben Gamari

This project was successful. Andreas posted this gist including some *very impressive* numbers. Some patches that he worked on this summer have already been merged into GHC, and it looks like the bulk of his work will also be merged soon.

Student: Francesco Gazzetta (@fgaz)

Mentors: Mikhail Glushenkov, Edward Yang

This project was successful. Francesco delivered great work just like last year and it sounds like this will be merged into the Cabal library soon. He put together a final report here.

Student: Luke Lau

Mentors: Alan Zimmerman

Blog: https://lukelau.me/haskell/

This project was successful. Luke wrote a bit about the project here. About his experience, he writes:

I had very little “real world” Haskell experience before starting, and there’s a lot of stuff they don’t teach you in university. But both my mentor and the Haskell community were extremely helpful with getting me up to speed and answering my many questions. Especially the #haskell channel on Freenode! In a lot of IRC channels you can find yourself asking a question and never being answered, but the people at the Haskell channel were very eager to help and explain/discuss lots of different topics.

Student: Shayan Najd

Mentors: Ben Gamari, Alan Zimmerman

This project was successful. Shayan made significant progress on the trees-that-grow fork of GHC, and has a lot of patches ready to be reviewed and merged. The mentors are very positive about the approach. Shayan’s summary can be found here.

Student: Wisnu Adi Nurcahyo

Mentors: Tom Sydney Kerckhove, Jasper Van der Jeugt

Blog: https://medium.com/@nurcahyo

This project unfortunately did not pass the second evaluation.

Student: khilanravani

Mentors: Alp Mestanogullari

This project was successful. Khilan made several contributions to the Haskell Image Processing library. You can see some cool examples of the various algorithms in his wrap-up blogpost.

Student: Zubin Duggal

Mentors: Ben Gamari, Gershom Bazerman, Joachim Breitner

This project was successful. Zubin’s patches have not been merged yet but should be in the next few months. His final report can be found here.

Student: Alexis Williams

Mentors: Herbert Valerio Riedel, Mikhail Glushenkov

This project was successful. Alexis contributed key features to Cabal’s new-build infrastructure and also fixed an impressive number of bugs. She writes about her experience here.

Student: Andrew Knapp

Mentors: Sacha Sokoloski, Trevor L. McDonell, Edward Kmett, Alois Cochard

This project was successful. Despite this being an *extremely* hard topic to tackle, Andrew was able to get some impressive preliminary results. These lay a very good foundation for future work that could be very valuable to the Haskell community.

We would like to thank all the participants again for the great summer and we already look forward to the next one!

]]>We are happy to announce the 17 projects that have been accepted to participate in Google Summer of Code 2018 for the Haskell.org project.

We would like to thank Google for organizing the program, all students who applied for the quality of their proposals, and of course the mentors for volunteering to guide the projects!

Without further ado, here are the accepted projects:

- Visual Tools and Bindings for Debugging in Code World
- Help Hadrian
- Add support for deprecating exports
- Hi Haddock
- Improving the GHC code generator
- Crucible: A Library for In-Memory Data Analysis in Haskell
- Dependently Typed Core Replacement in GHC
- Benchmarking graph libraries and optimising algebraic graphs
- Improvements to GHC’s compilation for conditional constructs.
- Support for Multiple Public Libraries in a .cabal package
- Functional test framework for the Haskell IDE Engine and Language Server Protocol Library
- Native-Metaprogramming Reloaded
- Format-Preserving YAML
- Enhancing the Haskell Image Processing Library with State of the Art Algorithms
- Making GHC Tooling friendly
- Helping cabal new-build become just cabal build
- Parallel Automatic Differentiation

Student: Krystal Maughan

Mentors: Chris Smith, Gabriel Gonzalez

Blog: https://kammitama5.github.io/

Visual Debugging tools that will allow various ages to interact with and learn visually while tracing their bugs in Haskell.

Student: Chitrak Raj Gupta

Mentors: Andrey Mokhov, Moritz Angermann

Current build systems such as `make` have a very complex structure and are difficult to understand or modify. Hadrian uses functional programming to implement abstractions that make the codebase much more comprehensible. Build rules are defined using the Shake library, and the results produced are much faster and more scalable than the current make-based system. But the in-use implementation of Hadrian is still in the development phase and not completely ready to be deployed. I believe that Hadrian will be of huge assistance in increasing the productivity of Haskell developers. Therefore, the aim of my project will be to push Hadrian a few steps closer to deployment, so that the Haskell community can code with a bit more efficiency.

A recent Pull Request by Alp Mestanogullari has implemented a basic rule for binary distribution. Also, I have been able to figure out multiple sources of errors causing validation failures, and my Pull Request has brought the number of failures down significantly.

Hence, the major goals of my project will be to:

- Achieve ghc-quake milestone that is listed in Hadrian.
- Implement missing features in Hadrian.
- Build a more comprehensive documentation of Hadrian.

Student: alanas

Mentors: Matthew Pickering, Erik de Castro Lopo

Add support of deprecation pragmas within module exports. This would ease the transition between different versions of the software by warning the developers that the functions/types/classes/constructors/modules that they are using are deprecated.

Student: Simon Jakobi

Mentors: Herbert Valerio Riedel, Alex Biehl

Blog: https://sjakobi.github.io/

A long-standing issue with Haskell’s documentation tool Haddock is that it needs to effectively re-perform a large part of the parse/template-haskell/typecheck compilation pipeline in order to extract the necessary information from Haskell source for generating rendered Haddock documentation. This makes Haddock generation a costly operation, and makes for a poor developer experience.

An equally long-standing suggestion to address this issue is to have GHC include enough information in the generated `.hi` interface files in order to avoid Haddock having to duplicate that work. This would pave the way for the following use-cases and/or have the following benefits:

- Significantly speed up Haddock generation by avoiding redundant work.
- On-the-fly/lazy after-the-fact Haddock generation in cabal new-haddock and stack haddock for already built/installed Cabal library packages.
- Add native support for a :doc command in GHCi’s REPL and editor tooling (ghc-mod/HIE) similar to the one available in other languages (c.f. the Idris REPL or the Python REPL)
- Allow downstream tooling like Hoogle or Hayoo! to index documentation right from interface files.
- Simplify Haddock’s code base.

Student: Abhiroop Sarkar

Mentors: Carter Schonwald, Ben Gamari

This project attempts to improve the native code generator of GHC by adding support for Intel AVX and SSE SIMD instructions. This support would enable GHC to expose a number of vector primitive operations, which can be used by various high-performance and scientific computing libraries of the Haskell ecosystem to parallelize their code for free.

Student: Gagandeep Bhatia

Mentors: Marco Zocca, Andika D. Riyandi

*Note: this project was slightly adjusted from its proposed form after some discussion with the mentors and it will have a stronger focus on improving existing libraries.*

A typical workflow in interactive data analysis consists of:

- Loading data (e.g. a CSV on disk)
- Transforming the data
- Various data processing stages
- Storing the result in some form (e.g. in a database).

The goal of this project is to provide a unified and idiomatic Haskell way of carrying out these tasks. Informally, you can think of “dplyr”/“tidyr” from the R ecosystem, but type safe. This project aims to provide a library with the following features:

- An efficient data structure for possibly larger-than-memory tabular data. The Frames library is notable prior work, and this project may build on top of it (namely, by extending its functionality for generating types from stored data).
- A set of functions to “tidy”/clean the data to bring it to a form fit for further analysis, e.g. splitting one column to multiple columns (“spread”) or vice versa (“gather”).
- A DSL for performing a representative set of relational operations e.g. filtering/aggregation.
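As a rough illustration of the “gather” and “spread” operations mentioned above, here is a minimal sketch over plain lists. The `Wide` and `Long` types and both function names are hypothetical and chosen only for this example; they are not the API of Frames or of the library proposed here, and a real implementation would work on typed, possibly larger-than-memory tables.

```haskell
import Data.List (nub)

-- Hypothetical "wide" row: an identifier plus named measurement columns.
type Wide = (String, [(String, Double)])

-- Hypothetical "long" row: identifier, column name, value.
type Long = (String, String, Double)

-- gather: turn every measurement column into its own row.
gather :: [Wide] -> [Long]
gather rows = [ (key, col, val) | (key, cols) <- rows, (col, val) <- cols ]

-- spread: group long rows back by identifier, keeping first-seen order.
-- Identifiers that had no columns at all cannot be recovered.
spread :: [Long] -> [Wide]
spread longs =
  [ (key, [ (col, val) | (k, col, val) <- longs, k == key ])
  | key <- nub [ k | (k, _, _) <- longs ]
  ]

main :: IO ()
main = do
  let wide :: [Wide]
      wide = [ ("alice", [("height", 1.70), ("weight", 60)])
             , ("bob",   [("height", 1.80)])
             ]
  mapM_ print (gather wide)
  -- Round trip: spreading the gathered rows gives back the wide table.
  print (spread (gather wide) == wide)
```

The quadratic `nub`-based grouping keeps the sketch dependency-free; it is not meant to be efficient.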

Student: Ningning Xie

Mentors: Richard Eisenberg

In recent years, several works (Weirich et al., 2017; Eisenberg, 2016; Gundry, 2013) have proposed to integrate dependent types into Haskell. However, compatibility with existing GHC features makes adding full-fledged dependent types into GHC very difficult. Fortunately, GHC has many phases underneath so it is possible to change one intermediate language without affecting the user experience, as steps towards dependent Haskell. The goal of this proposal is the replacement of GHC’s core language with a dependently-typed variant.

Student: Alexandre Moine

Mentors: Andrey Mokhov, Alois Cochard

Blog: https://blog.nyarlathotep.one/

Graphs are a key structure in computer science, and they are known to be difficult to work with in functional programming languages. Several libraries are being implemented to create and process graphs in Haskell, each of them using a different graph representation: Data.Graph from containers, fgl, hash-graph and alga. Due to their differences and the lack of a common benchmark, it is not easy for a new user to select the one that will best fit their project. The new approach of alga seems particularly interesting since the way it deals with graphs is based on tangible mathematical results. Still, it is not very user friendly and it lacks some important features like widely-used algorithms or edge labels.

Therefore, I propose to develop a benchmarking suite that will establish a reference benchmark for these libraries, as well as to enhance alga’s capabilities.
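For readers unfamiliar with the algebraic approach, the sketch below shows the four construction primitives that algebraic graphs are built from (empty, vertex, overlay, connect). The helper functions `vertices` and `edges` are written just for this example and are assumptions of the sketch; they are not alga’s actual API and make no attempt at efficiency.

```haskell
import Data.List (nub, sort)

-- The four primitives of an algebraic graph.
data Graph a
  = Empty
  | Vertex a
  | Overlay (Graph a) (Graph a)  -- union of vertices and edges
  | Connect (Graph a) (Graph a)  -- overlay, plus an edge from every
                                 -- left vertex to every right vertex

-- Sorted, duplicate-free list of vertices (illustrative helper).
vertices :: Ord a => Graph a -> [a]
vertices = nub . sort . go
  where
    go Empty         = []
    go (Vertex x)    = [x]
    go (Overlay g h) = go g ++ go h
    go (Connect g h) = go g ++ go h

-- Sorted, duplicate-free list of edges (illustrative helper).
edges :: Ord a => Graph a -> [(a, a)]
edges = nub . sort . go
  where
    go Empty         = []
    go (Vertex _)    = []
    go (Overlay g h) = go g ++ go h
    go (Connect g h) =
      go g ++ go h ++ [ (x, y) | x <- vertices g, y <- vertices h ]

main :: IO ()
main = do
  -- The path 1 -> 2 -> 3, built by overlaying two single edges.
  let path :: Graph Int
      path = Overlay (Connect (Vertex 1) (Vertex 2))
                     (Connect (Vertex 2) (Vertex 3))
  print (vertices path)  -- [1,2,3]
  print (edges path)     -- [(1,2),(2,3)]
```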

Student: Andreas Klebinger

Mentors: José Calderón, Joachim Breitner, Ben Gamari

While GHC is state of the art in many respects, compilation of conditional constructs has fallen behind projects like Clang/GCC.

I intend to rectify this by working on the following tasks:

- Implement cmov support for Cmm
- Use cmov to improve simple branching code
- Use lookup tables over jump tables for value selection when useful.
- Enable expression of three way branching on values in Cmm code.
- Improve placement of stack adjustments and checks.

Student: Francesco Gazzetta (@fgaz)

Mentors: Mikhail Glushenkov, Edward Yang

Large-scale Haskell projects tend to have a problem with lockstep distribution of packages (especially backpack projects, being extremely granular). The unit of distribution (package) coincides with the buildable unit of code (library), and consequently each library of such an ecosystem (e.g. amazonka) requires duplicate package metadata (and tests, benchmarks…).

This project aims to separate these two units by introducing multiple libraries in a single cabal package.

This proposal is based on this issue by ezyang.

Student: Luke Lau

Mentors: Alan Zimmerman

Blog: https://lukelau.me/haskell/

The Haskell IDE Engine is a Haskell backend for IDEs, which utilises the Language Server Protocol to communicate between clients and servers.

This project aims to create a test framework that can describe a scenario between an LSP client and server from start to finish, so that functional tests may be written for the IDE engine. If time permits, this may be expanded to be language agnostic or provide a set of compliance tests against the LSP specification.
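Any such test framework has to speak the base protocol first: each LSP message is a JSON body preceded by a `Content-Length` header and a blank line. The sketch below shows only that framing step. The function names `frame` and `unframe` are hypothetical and are not taken from the Haskell IDE Engine or its test framework, and the sketch assumes ASCII-only bodies, so that the character count equals the byte count the header requires.

```haskell
import Data.List (stripPrefix)

-- Wrap a JSON body in the LSP base-protocol header.
-- Assumes an ASCII-only body (character count == byte count).
frame :: String -> String
frame body =
  "Content-Length: " ++ show (length body) ++ "\r\n\r\n" ++ body

-- Parse one framed message off the front of the input, returning the
-- body and whatever input is left over. Only the single-header form
-- produced by 'frame' is handled.
unframe :: String -> Maybe (String, String)
unframe input = do
  rest <- stripPrefix "Content-Length: " input
  let (digits, afterDigits) = span (`elem` ['0' .. '9']) rest
  n <- if null digits then Nothing else Just (read digits)
  payload <- stripPrefix "\r\n\r\n" afterDigits
  if length payload < n then Nothing else Just (splitAt n payload)

main :: IO ()
main = do
  let request = "{\"jsonrpc\":\"2.0\",\"id\":1,\"method\":\"initialize\",\"params\":{}}"
  -- A client/server scenario is then a sequence of such framed messages.
  print (unframe (frame request) == Just (request, ""))
```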

Student: Shayan Najd

Mentors: Ben Gamari, Alan Zimmerman

The goal is to continue ongoing work, utilising the Trees that Grow technique, to introduce native-metaprogramming in GHC. Native-metaprogramming is a form of metaprogramming where a metalanguage’s own infrastructure is directly employed to generate and manipulate object programs. It begins by creating a single abstract syntax tree (AST) which can serve a purpose similar to what is currently served by Template Haskell (TH), and the front-end AST inside GHC (HsSyn). Meta-programs could then leverage, much more directly, the machinery implemented in GHC to process Haskell programs. This work can also possibly integrate with Alan Zimmerman’s work on compiler annotations in GHC, and enable better IDE support.

Student: Wisnu Adi Nurcahyo

Mentors: Tom Sydney Kerckhove, Jasper Van der Jeugt

Blog: https://medium.com/@nurcahyo

Sometimes Stack (The Haskell Tool Stack) asks us to add an extra dependency manually. Suppose that we use the latest Hakyll, which needs a `pandoc-citeproc-0.13` that is missing from the latest stable Stack LTS. Stack asks us to add the extra dependency to solve this problem. Wouldn’t it be nice if Stack could add the extra dependency by itself?

Student: khilanravani

Mentors: Alp Mestanogullari

The project proposed here aims to implement different classes of image processing algorithms in Haskell and incorporate them into the existing code base of the Haskell Image Processing (HIP) package. The algorithms that I plan to incorporate in the HIP package have vast applications in actual problems in image processing. Including these algorithms in the existing code base would help more and more users to really use Haskell while working on computer vision problems, and this would put Haskell (kind of) ahead in the race with functional programming languages such as Elm or Clojure (since their image processing libraries are pretty naive). In this way, this project can substantially benefit the Haskell organization as well as the open source community. Some of the algorithms proposed here include the famous Canny edge detection and Floyd-Steinberg dithering, along with other popular tools used in computer vision problems.

Student: Zubin Duggal

Mentors: Ben Gamari, Gershom Bazerman, Joachim Breitner

GHC builds up a wealth of information about Haskell source as it compiles it, but throws all of it away when it’s done. Any external tools that need to work with Haskell source need to parse, typecheck and rename files all over again. This means Haskell tooling is slow and has to rely on hacks to extract information from GHC. Allowing GHC to dump this information to disk would simplify and speed up tooling significantly, leading to a much richer and more productive Haskell developer experience.

Student: typedrat

Mentors: Herbert Valerio Riedel, Mikhail Glushenkov

While much of the functionality required to use the `new-*` commands has already been implemented, there are not-insignificant parts of the design that was created last year that remain unrealized.

By completing more of this design, I plan to help the `new-` prefix go away and to allow this safer, cleaner system to replace old-style cabal usage fully by rounding off the unfinished edges of the current proposal.

Student: Andrew Knapp

Mentors: Sacha Sokoloski, Trevor L. McDonell, Edward Kmett, Alois Cochard

Automatic Differentiation (AD) is a technique for computing derivatives of numerical functions that does not use symbolic differentiation or finite-difference approximation. AD is used in a wide variety of fields, such as machine learning, optimization, quantitative finance, and physics, and the productivity boost generated by parallel AD has played a large role in recent advances in deep learning.

The goal of this project is to implement parallel AD in Haskell using the `accelerate` library. If successful, the project will provide an asymptotic speedup over current implementations for many functions of practical interest, stress-test a key foundation of the Haskell numerical infrastructure, and provide a greatly improved key piece of infrastructure for three of the remaining areas where Haskell’s ecosystem is immature.
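To make the idea of AD concrete, here is a minimal sketch of sequential forward-mode AD using dual numbers. It is not the parallel, `accelerate`-based approach this project targets, and the names `Dual` and `diff` are chosen only for this example. Each value carries its derivative along with it, so the derivative is computed exactly by the usual calculus rules rather than by symbolic manipulation or finite differences.

```haskell
-- A dual number: a value paired with the derivative of that value.
data Dual = Dual Double Double
  deriving (Show, Eq)

instance Num Dual where
  Dual x dx + Dual y dy = Dual (x + y) (dx + dy)
  Dual x dx - Dual y dy = Dual (x - y) (dx - dy)
  -- Product rule: (xy)' = x y' + x' y
  Dual x dx * Dual y dy = Dual (x * y) (x * dy + dx * y)
  negate (Dual x dx)    = Dual (negate x) (negate dx)
  abs (Dual x dx)       = Dual (abs x) (dx * signum x)
  signum (Dual x _)     = Dual (signum x) 0
  -- Constants have derivative zero.
  fromInteger n         = Dual (fromInteger n) 0

-- Differentiate a function at a point by seeding the derivative with 1.
diff :: (Dual -> Dual) -> Double -> Double
diff f x = let Dual _ dx = f (Dual x 1) in dx

main :: IO ()
main =
  -- d/dx (x^2 + 3x) = 2x + 3, which is 7 at x = 2.
  print (diff (\x -> x * x + 3 * x) 2)  -- 7.0
```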