- Cabal file support for HLS
- Implement Resolution Methods in HLS
- Goto Definition for Third-Party Libraries in HLS
- Teaching Weeder About Type Class Instances
- Standardize GHC’s Error Dump in JSON Format
- Maximally Decoupling Haddock and GHC
- Representing Pattern
- Improving Calligraphy
- Structured Errors and Error Codes for cabal-install

Contributor: Jana Chadt

Mentor: Fendor

Abstract:

The goal of this proposal is to provide cabal file support for Haskell Language Server. I have been working on the cabal plugin for Haskell Language Server during various hackathons since 2021, implementing formatting and code completion for cabal files, and I would like to be able to commit to working on the plugin full time this summer.

Contributor: Nathan Maxson

Mentor: Michael Peyton Jones

Abstract:

With “codeAction/resolve” and “codeLens/resolve”, the Language Server Protocol has added methods that allow language servers to delay some of the work they need to do for code actions and code lenses until it is actually needed, giving the server significant savings in both memory and CPU usage. This proposal is to add both of these methods to haskell-language-server, allowing plugins to call them at will. In addition, I propose adding support for the resolve methods to some of haskell-language-server’s plugins.

Contributor: Elodie Lander

Mentor: Zubin Duggal

Abstract:

Making goto definition work for third party libraries is of interest to me as a Haskell developer because it is a feature I would like to use in my Haskell development. In fact, it is the feature that might finally motivate me to use HLS in my own projects. My Haskell workflow has usually involved a lot of switching back and forth between my editor and Hackage documentation in the browser. I believe that being able to see third party library definitions in my editor would reduce this back and forth significantly and help increase my efficiency as a Haskell developer.

Contributor: Vasily Sterekhov

Mentor: Ollie Charles

Abstract:

A frequent complaint about Haskell is the lack of tooling. This proposal aims to contribute to improving the situation by addressing a particular limitation of Weeder, a tool for detecting dead code. In the process, this may involve proposing minor additions to GHC’s hie files, which may benefit other projects working in the same area.

Contributor: Ben Bellick

Mentor: Aaron Allen

Abstract:

GHC is currently undergoing a large-scale project to move to a more structured error representation by treating errors as values. An additional useful feature that can be made available is to dump a JSON representation of warnings and errors. An experimental implementation of this feature exists when GHC is invoked with -ddump-json, although this is an unfinished command which suffers from the following:

- it is non-standardized
- it does not leverage the new structured error representation
- previous implementation issues led to a hard-coding of output to stdout

There is an opportunity to benefit consumers of GHC output and to improve Haskell tooling infrastructure. Some examples of possible use cases for downstream consumers can be found here. Not all consumers of Haskell’s error messages intend to do so via the GHC API, and such a standardized JSON output enables a larger set of developers to expand the error tooling in the Haskell ecosystem. I am also personally excited to help with this project because I love Haskell and want to make a contribution to one of its crowning achievements: GHC. I am especially interested in any improvements which enable outside consumers to better understand and process the internals of the compiler.

Contributor: Gregory Baimetov

Mentor: Laurent P. René de Cotret

Abstract:

In practice, development and usage of Haddock is strongly coupled to the internals of the Glasgow Haskell Compiler (GHC). One concrete example of this coupling is the fact that Haddock makes use of the GHC parser itself. Therefore, if Haddock was compiled using GHC version X, it might not be able to parse the source code of a Haskell program written for GHC version Y > X.

This strong coupling between GHC and Haddock slows down Haddock development and prevents Haddock from being better integrated in other tools, such as Hackage, the Haskell Language Server, or Hoogle.

Contributor: Saachi Kaup

Mentor: Alex McLean

Abstract:

Using Haskell’s advanced type system to map the structures in Tidal Cycles to the underlying shapes of Mandala art and produce beautiful visualisations.

Contributor: Dominic Mills

Mentor: Luis Morillo Najarro

Abstract:

Calligraphy, a tool for visualizing Haskell projects, faces the challenges of developing and maintaining Haskell tooling due to the constantly evolving nature of the language and its implementation in GHC.

In light of these challenges, the primary aim of this Summer of Haskell project is to enhance the Calligraphy tool to provide visualizations that are both simple and easy to use. This will be done by modularizing the Calligraphy tool into its various parts, such as calligraphy-gui, calligraphy-graphviz, calligraphy-cli, and calligraphy-fgl, in addition to keeping it up to date with GHC releases.

Contributor: Suganya Arun

Mentor: Gershom Bazerman

Abstract:

The https://errors.haskell.org/ site provides an index that maps error codes in Haskell tooling to documentation. GHC, ghcup, and stack have all begun to implement support for structured errors that have assigned codes. This project is to refactor the cabal codebase to also provide structured errors rather than mere strings, and to assign cabal errors corresponding codes that can be added to the error index.

In 2022, the program will be open to all newcomers to open source who are 18 years or older. GSoC will no longer be solely focused on university students or recent graduates: people at various stages of their career, recent career changers, the self-taught, those returning to the workforce, and others are welcome to join. These changes should better fulfill the needs of open source communities and provide more flexibility to both projects and contributors.

Organizers are aware that not everyone can devote an entire summer to coding. Offered projects are available in multiple sizes: medium (~175 hours) and large (~350 hours). Contributors can follow the standard 12-week program or extend the deadline up to 22 weeks.

Are you working on a Haskell project that could use the help of a student during the summer? Consider contributing it as an idea here! Send a pull request to our GitHub repository (example from 2020). If you just want to discuss a possible idea, don’t hesitate to get in touch with us and/or read through the student/contributor guide.

We encourage you to explore GSoC’s webpage, and you can learn more on the FAQ website. All the updates about this year’s GSoC edition can be found in this blogpost.

GSoC 2022: OSS projects, developed during Summer (from June to September/November) for newcomers that are 18 years and older and want to spend 175 - 350 hours on coding activities, with a mentor’s support.

Despite that, all our 10 slots were successful! This is the first time that this has happened in the history of Haskell.org’s participation in the program. Some of these are high-profile and will benefit a lot of users in the ecosystem, which is super exciting.

**Enhanced figure support in pandoc**

Student: Aner Lucero

Mentors: tarleb

Student report

Google summer of code was a great way to expand my involvement with the haskell community and to test my knowledge working on one of haskell’s most used apps.

**Gradually Typed Hasktorch**

Student: Julius Marozas

Mentors: Torsten Scholak

Student report

**Dhall bindings to TOML configuration language**

Student: Marcos Lerones

Mentors: Gabriella Gonzalez, Simon Jakobi

Student report

**Haskell in CodeMirror 6**

Student: Olivian Cretu

Mentors: Chris Smith

Student repo

**Fixing ihaskell-widgets**

Student: David Davó

Mentors: James Brock, Vaibhav Sagar

Student report

Three years ago, I started learning Haskell and functional programming. As I had recently started using Jupyter Notebooks in other projects, I wanted to try using them with Haskell to take notes and do the course homework. A few weeks in, I noticed I couldn’t use the widgets, but I didn’t give it much thought. Three years later, this summer, I’ve had the opportunity to fix it, while learning a lot in the process.

That’s what open source is about.

**TidalCycles API and editor plugin**

Student: Martin Gius

Mentors: Alex McLean

Student report

**Haskell Language Server: Symbol Renaming**

Student: Oliver Madine

Mentors: Pepeiborra

Student report

Working on the Haskell Language Server (HLS) was my first time using Haskell in production. While navigating through different areas of the tooling infrastructure, the community was supportive in helping me develop my understanding.

Specifically, my project involved exploring hie-bios and the GHC API to create a symbol renaming plugin. Overall, the work was engaging, and I was able to substantially improve my development skills with the help of my mentor!

**Support call hierarchy in Haskell Language Server**

Student: Lei Zhu

Mentors: Javier Neira, Pepeiborra

Student report

Haskell community is warm and friendly to everyone, no matter you are a beginner or an expert. This summer, I am more familiar with haskell-language-server and GHC itself. Thank haskell.org and GSoC for providing this opportunity!

**Visualization Libraries for ghc-debug**

Student: Ethan Tsz Hang Kiang

Mentors: Matthew Pickering

Student report

**TOML Support in dhall-haskell**

Student: Julio Grillo

Mentors: Gabriella Gonzalez, Simon Jakobi

Student report

We hope that Google hosts the program in 2022, in which case we plan to apply again. If you have ideas for projects that students could work on, we’ll be using the same format as the years before – this page has more information on how to submit an idea.

Thanks a lot to everyone involved!

**SPECIALIZABLE GHC pragma**

Student: Francesco Gazzetta @fgaz

Mentors: Carter Schonwald, Andreas Klebinger, chessai

Student report

**Add primops to expand the (boxed) array API**

Student: buggymcbugfix

Mentors: andrewthad, Andreas Klebinger, chessai

Student report

**Build-integration and Badges for Hackage**

Student: Shubham Awasthi

Mentors: hvr, Gershom Bazerman

Student report

**Building the Haskell Language Server and more**

Student: Luke Lau

Mentors: Alan Zimmerman, Pepe Iborra, Zubin Duggal

Student report

**Custom Dataloader for Hasktorch**

Student: Andre Daprato

Mentors: Austin Huang, Adam Paszke, Torsten Scholak, Junji Hashimoto

**Documentation generator for the Dhall configuration language**

Student: German Robayo

Mentors: Profpatsch, Gabriel Gonzalez, sjakobi

Student report

**Finish the package candidate workflow for Hackage**

Student: Sitao Chen

Mentors: hvr, Gershom Bazerman

Student report

This summer, I have participated in Google Summer of Code with Haskell org and worked on Hackage candidate UI and workflow. Without previous experience in open source development, I was able to grasp a large codebase and its structure in a short period with the help of my mentors. Besides, I got a chance to learn about how to make API calls and how to improve UI using Haskell in a formal setting. This experience helps me have a better understanding of packages workflow management and web services in Haskell. I wish I can contribute again in the future!

**Functional Machine Learning Algorithms for Music Generation**

Student: Elizabeth Wilson

Mentors: Alex McLean, Austin Huang, Torsten Scholak

Student report

**Multiple Home Packages for GHC**

Student: fendor

Mentors: Zubin Duggal, John Ericson, Matthew Pickering

Student report

Haskell IDE Engine was the first open source project I ever contributed to, and over time, it became a project of passion for me. Over the months I dove deeper into Haskell tooling, until I got the chance to work on GHC itself in this year’s Google Summer of Code! I worked on this project to improve the tooling situation for Haskell, as well as improving the IDE experience by implementing features needed by both.

The project itself proved to be challenging, mainly because of my unfamiliarity with the GHC code base. However, with the help of my helpful mentors, I was able to overcome the challenges and learned a lot about GHC. I am glad I had the chance to work on this project, although I did not accomplish everything I wanted to, yet.

**Number Field Sieves**

Student: Federico Bongiorno

Mentors: Sergey Vinokurov, Andrew Lelechenko

Student report

**Optimising Haskell developer tool performance using OpenTelemetry**

Student: Michalis Pardalos

Mentors: Dmitry Ivanov, Matthew Pickering

My project was about adding support for opentelemetry tracing into ghcide, the core component of haskell-language-server. I had very little experience with open-source development, or the internals of haskell and ghc before this project and I can say for sure that this has changed. Aside from working on ghcide itself, I also had to submit patches to haskell-opentelemetry, implementing features necessary for this project. When the project was blocked by a ghc bug, I also took this as an opportunity to dive into ghc and fix it myself, which I found incredibly rewarding and consider a valuable experience.

Even though I ended up running out of time and not finishing everything I hoped for in the project, I can say for sure that it was a positive experience which I would absolutely recommend.

**Update stylish-haskell to use ghc-lib-parser**

Student: Beatrice Vergani

Mentors: Jasper Van der Jeugt, lukaszgolebiewski, Paweł Szulc

Student report

Google will be hosting GSoC again in 2021, and of course we plan to apply again. If you have ideas for projects that students could work on, we’ll be using the same format as the years before – this page has more information on how to submit an idea.

Thanks a lot to everyone involved!

Haskell.org has been able to take part in this program in the past two years, and we’d like to keep this momentum up since it greatly benefits the community.

Google is not extremely open about what factors it considers for applications from organizations, but they have stated multiple times that a well-organized ideas list is crucial. For that, we would like to count on all of you again.

If you are the maintainer or a user of a Haskell project, and you have an improvement in mind which a student could work on during the summer, please submit an idea here:

https://summer.haskell.org/ideas.html

For context, Google Summer of Code is a program where Google sponsors students to work on open-source projects during the summer. Haskell.org has taken part in this program in 2006-2015, and 2018-2019. Many important improvements to the ecosystem have been the direct or indirect result of Google Summer of Code projects, and it has also connected new people with the existing community.

Projects should benefit as many people as possible – e.g. an improvement to GHC will benefit more people than an update to a specific library or tool, but both are definitely valid. New libraries and applications written in Haskell, rather than improvements to existing ones, are also accepted. Projects should be concrete and small enough in scope such that they can be finished by a student in three months. Past experience has shown that keeping projects “small” is almost always a good idea.

Unfortunately, this summary is less successful – I meant to contact the students immediately after the summer, but that mail never went through and I failed to follow up on it – my apologies.

In either case, I still wanted to list the successful projects here for posterity. I reached out to the students again and will be updating this post with more information and quotes as they get back to me.

**A language server for Dhall**

Student: Frederik Ramcke

Mentors: Luke Lau, Gabriel Gonzalez

**A stronger foundation for interactive Haskell tooling**

Student: dxld

Mentors: Alan Zimmerman, Matthew Pickering

**Automated requirements checking as a GHC plugin**

Student: Daniel Marshall

Mentors: Chris Smith, chessai, Alphalambda

**Extending Alga**

Student: O V Adithya Kumar

Mentors: Andrey Mokhov, Jasper Van der Jeugt, Alexandre Moine

**Extending Hasktorch With RNNs and Encoder-Decoder**

Student: AdLucem

Mentors: Austin Huang, Junji Hashimoto, Sam Stites

**Functional Machine Learning with Hasktorch: Produce Functional Machine Learning Model Reference Implementations**

Student: Jesse Sigal

Mentors: Austin Huang, idontgetoutmuch, Junji Hashimoto, Sam Stites

**Hadrian Optimisation**

Student: ratherforky

Mentors: Andrey Mokhov, Neil Mitchell

**Implementing Chebyshev polynomial approximations in Haskell: Having the speed and precision of numerics with complex, non-polynomial functions.**

Student: Deifilia To

Mentors: tmcdonell, idontgetoutmuch, Albert Krewinkel

**Improving Hackage Matrix Builder as a Real-world Fullstack Haskell Project**

Student: Andika Riyandi (Rizary)

Mentors: Herbert Valerio Riedel, Robert Klotzner

**Improving HsYAML Library**

Student: Vijay Tadikamalla

Mentors: Herbert Valerio Riedel, Michał Gajda

**Issue-Wanted Web Application**

Student: Rashad Gover

Mentors: Veronika Romashkina, Dmitrii Kovanikov

**More graph algorithms for Alga**

Student: Vasily Alferov

Mentors: Andrey Mokhov, Alexandre Moine

**Property-based testing stateful programs using quickcheck-state-machine**

Student: Kostas Dermentzis

Mentors: stevana, Robert Danitz

**Putting hie Files to Good Use**

Student: Zubin Duggal

Mentors: Alan Zimmerman, Matthew Pickering

**Upgrading hs-web3 library**

Student: amany9000

Mentors: Alexander Krupenkin, Thomas Dietert

Thanks to everyone involved!

When you apply to Summer of Code, you write a proposal. The proposal is a document in which you describe your ideas on the chosen project. It should be a clear, detailed text with suggestions on every subtask. The proposal should also include a timeline, in which you estimate the time you intend to spend on each of those subtasks.

I chose this project for my summer. In my proposal, I drafted all the algorithms mentioned in the list and suggested a few more. I published this part of my proposal as a Github gist there.

I don’t suggest this gist as a complete example of a good proposal: it’s only a part of the document I submitted. You should also include some information about you, together with the timeline. Communication with your future mentors is also a significant part of the application.

However, as I mentioned in one of my previous posts, another student ended up doing the part suggested in the ideas list. So my task is to introduce bipartite graphs.

This task was my idea. I mentioned it in my proposal. I meant that finding maximum matchings in bipartite graphs should be easily implemented when we have algorithms for finding maximum flows in networks. Kuhn’s algorithm is an application of the Ford-Fulkerson algorithm, and the Hopcroft-Karp algorithm is an application of Dinic’s algorithm.

However, this option is not the best. Both algorithms have specialized implementations that work several times faster. So my task for this summer was to introduce bipartite graphs and special functions for working with them.

I made four pull requests to Alga this summer. Each pull request represents a separate task and summarizes the work of several weeks.

Each PR contains the actual implementation, tests, and documentation. The whole project is release-ready after merging each one of them. I put the tests in the `test/` directory. The documentation for each function and datatype precedes the declaration. After release, it will compile to a beautiful Haddock file like this.

**Link to PR:** https://github.com/snowleopard/alga/pull/207

In this part, I defined the `Bipartite.AdjacencyMap` datatype and added many functions to work with adjacency maps.

The datatype represents a map of vertices into their neighbours. I defined it as two maps:

```
data AdjacencyMap a b = BAM {
    leftAdjacencyMap  :: Map.Map a (Set.Set b),
    rightAdjacencyMap :: Map.Map b (Set.Set a)
}
```

The properties are based on the existing properties of graphs in Alga.
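As a minimal sketch of how the two maps relate, every edge has to be recorded once from each side. The helpers `fromEdges`, `edgeSet` and `consistent` below are hypothetical names for illustration and are not claimed to be part of Alga's API; the datatype is repeated only to keep the sketch self-contained.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set
import Data.Tuple (swap)

-- Same shape as the datatype above, repeated so the sketch compiles alone.
data AdjacencyMap a b = BAM
    { leftAdjacencyMap  :: Map.Map a (Set.Set b)
    , rightAdjacencyMap :: Map.Map b (Set.Set a)
    }

-- Hypothetical helper: build a bipartite graph from a list of edges,
-- recording every edge in both maps. Isolated vertices are not supported.
fromEdges :: (Ord a, Ord b) => [(a, b)] -> AdjacencyMap a b
fromEdges es = BAM
    (Map.fromListWith Set.union [ (x, Set.singleton y) | (x, y) <- es ])
    (Map.fromListWith Set.union [ (y, Set.singleton x) | (x, y) <- es ])

-- Hypothetical helper: all (vertex, neighbour) pairs stored in one map.
edgeSet :: (Ord x, Ord y) => Map.Map x (Set.Set y) -> Set.Set (x, y)
edgeSet m = Set.fromList [ (x, y) | (x, ys) <- Map.toList m, y <- Set.toList ys ]

-- Hypothetical invariant check: the edges seen from the left part are
-- exactly the edges seen from the right part.
consistent :: (Ord a, Ord b) => AdjacencyMap a b -> Bool
consistent (BAM l r) = edgeSet l == Set.map swap (edgeSet r)
```

Storing both directions doubles the memory but makes neighbour lookups cheap from either part, which is presumably why the datatype keeps two maps.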

**Link to PR:** https://github.com/snowleopard/alga/pull/218

There is a folklore algorithm that checks if a given graph is bipartite. The task to implement this algorithm in Haskell was a little challenging for me.

I finished up with the following definition of the function:

`detectParts :: Ord a => AM.AdjacencyMap a -> Either (OddCycle a) (AdjacencyMap a a)`

It is known that a graph is bipartite if and only if it contains no cycles of odd length. This function either finds an odd cycle or returns a partition.

The implementation is so exciting that I wrote a whole post about it. I explained the reason I needed monad transformers there and made some interesting benchmarks that pointed me to use the explicit `INLINE` directive.

**Link to the unfinished PR**: https://github.com/snowleopard/alga/pull/226

Some families of graphs are bipartite: simple paths, even cycles, trees, bicliques, etc. The task is to provide a simple method to construct all those graphs.

The most exciting part of this task was to provide type-safe implementations. For example, only cycles of even length are bipartite. And speaking of paths, we should provide a method for constructing paths of vertices of two different types.

The `circuit` definition for constructing graphs containing one even cycle is simple:

`circuit :: (Ord a, Ord b) => [(a, b)] -> AdjacencyMap a b`

For the paths, I added a special type for alternating lists:

`data List a b = Nil | Cons a (List b a)`

So the `path` definition is:

`path :: (Ord a, Ord b) => List a b -> AdjacencyMap a b`
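To illustrate why the alternating list makes paths type-safe, here is a small sketch that lists the edges of such a path. The helper `pathEdges` is a hypothetical name for illustration and is not taken from the PR; the `List` type is the one defined above.

```haskell
import Data.Tuple (swap)

-- The alternating list type from the text: element types swap at each step.
data List a b = Nil | Cons a (List b a)

-- Hypothetical helper: the edges of the path described by an alternating
-- list, each reported as (left vertex, right vertex).
-- The recursive call is at type List b a, so the signature is required
-- (polymorphic recursion), and its result is flipped back with swap.
pathEdges :: List a b -> [(a, b)]
pathEdges (Cons x rest@(Cons y _)) = (x, y) : map swap (pathEdges rest)
pathEdges _                        = []
```

For example, `pathEdges (Cons 1 (Cons 'a' (Cons 2 (Cons 'b' Nil))))` yields `[(1,'a'),(2,'a'),(2,'b')]`: the path 1, a, 2, b with every edge oriented from the left part to the right part. A list with two adjacent vertices of the same part simply does not typecheck when the two types differ.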

For now, the PR is almost merge-ready; only a few small review comments still need fixes.

**Link to the unfinished PR**: https://github.com/snowleopard/alga/pull/229

The Hopcroft-Karp algorithm is the fastest one for maximum matchings in bipartite graphs. The implementation is rather straightforward.

However, there is an aspect of this PR I’d like to share here.

I implemented the following function:

```
augmentingPath :: (Ord a, Ord b) => Matching a b
-> AdjacencyMap a b
-> Either (VertexCover a b) (List a b)
```

Given a matching in a graph, it returns either an augmenting path for the matching or a vertex cover of the same size, thus proving that the given matching is maximum. As both outcomes can be easily verified, this helps to write perfect tests that ensure that the matching returned by my function is maximum indeed.
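The verification idea can be sketched as follows. Any matching is at most as large as any vertex cover, so a matching and a cover of equal size certify each other as optimal. The list-based `Matching` and `VertexCover` types and all function names below are simplified, hypothetical stand-ins and not the types used in the PR.

```haskell
import qualified Data.Set as Set

-- Simplified, hypothetical representations (not the types from the PR).
type Edge a b        = (a, b)
type Matching a b    = [Edge a b]
type VertexCover a b = [Either a b]

-- True when the list has no repeated elements.
distinct :: Ord x => [x] -> Bool
distinct xs = Set.size (Set.fromList xs) == length xs

-- A matching uses only existing edges and touches each vertex at most once.
isMatching :: (Ord a, Ord b) => [Edge a b] -> Matching a b -> Bool
isMatching es m = all (`elem` es) m && distinct (map fst m) && distinct (map snd m)

-- A vertex cover contains at least one endpoint of every edge.
isVertexCover :: (Ord a, Ord b) => [Edge a b] -> VertexCover a b -> Bool
isVertexCover es c = all covered es
  where
    cover          = Set.fromList c
    covered (x, y) = Left x `Set.member` cover || Right y `Set.member` cover

-- A valid matching and a valid cover of the same size prove that the
-- matching is maximum (and the cover minimum).
certifiesMaximum :: (Ord a, Ord b)
                 => [Edge a b] -> Matching a b -> VertexCover a b -> Bool
certifiesMaximum es m c =
    isMatching es m && isVertexCover es c && length m == Set.size (Set.fromList c)
```

A property test can then generate a random bipartite graph, run the matching algorithm, and check the returned certificate with `certifiesMaximum` instead of comparing against a second implementation.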

This PR still needs some work. The reason is that two different implementations behave weirdly on the benchmarks.

I wrote a lot of Haskell this summer. This gave me a lot of experience in this language. Although there’s still work to be done, I’m satisfied with the results I got.

I adore the way functional programs are developed. I was surprised to learn how the popular testing (QuickCheck) and benchmarking (Criterion) frameworks are organized. And the precision of the documentation makes the work a lot easier.

A graph is called bipartite if its vertices can be split into two parts in such a way that there are no edges inside one part. While testing a graph for tripartiteness is NP-hard, there is a linear algorithm that tests a graph for bipartiteness and restores the partition.

The algorithm is usually one of the first graph algorithms given in any university course. The idea is rather straightforward: we try to assign vertices to the left or right part in some way, and when we get a conflict, we claim that the graph is not bipartite.

First, we assign some vertex to the left part. Then, we can confidently say that all neighbours of this vertex should be assigned to the right part. Then, all neighbours of those vertices should be assigned to the left part, and so on. We continue this until all the vertices in the connected component are assigned to some part, then we repeat the same action on the next connected component, and so on.

If there is an edge between vertices in the same part, one can easily find an odd cycle in the graph, hence the graph is not bipartite. Otherwise, we have the partition, hence the graph is bipartite.

There are two common ways of implementing this algorithm in linear time: using Depth-First Search or Breadth-First Search. We usually select DFS for this algorithm in imperative languages. The reason is that DFS implementation is a little bit simpler. I selected DFS, too, as a traditional way.

So, now we came to the following scheme. We go through the vertices in DFS order and assign them to parts, flipping the part when going through an edge. If we try to assign some vertex to some part and see that it is already assigned to another part, then we claim that the graph is not bipartite. When all vertices are assigned to parts and we’ve looked through all edges, we have the partition.
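Before bringing in monad transformers, the scheme can be sketched with plain `Maybe` and an explicitly threaded map. This is a simplified illustration under assumed names: the `Graph` synonym and `twoColour` are hypothetical, and unlike the implementation discussed later, this version only reports failure and does not recover the odd cycle.

```haskell
import Control.Monad (foldM)
import qualified Data.Map.Strict as Map

-- Hypothetical simplified graph: each vertex mapped to its neighbours.
-- For an undirected graph every edge must appear in both lists.
type Graph a = Map.Map a [a]

-- Try to assign a part (False = left, True = right) to every vertex.
-- Nothing signals a conflict, i.e. the graph is not bipartite.
twoColour :: Ord a => Graph a -> Maybe (Map.Map a Bool)
twoColour g = foldM start Map.empty (Map.keys g)
  where
    -- Start a DFS from every vertex that is still unassigned.
    start parts v
        | v `Map.member` parts = Just parts
        | otherwise            = visit False parts v
    -- Assign part p to v and recurse with the flipped part, or check
    -- consistency if v already has a part.
    visit p parts v = case Map.lookup v parts of
        Just q  -> if q == p then Just parts else Nothing
        Nothing -> foldM (visit (not p)) (Map.insert v p parts)
                         (Map.findWithDefault [] v g)
```

On the path 1, 2, 3 this returns an assignment with vertex 2 in the other part from 1 and 3; on a triangle it returns `Nothing`.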

In Haskell, all computations are supposed to be *pure*. Still, if it was
*really* so, we wouldn’t be able to print anything to the console. And what I
find most funny about pure computations is that they are so lazy that there is
no pure reason to compute anything.

Monads are the Haskell way to express computations with *effects*. I’m not
going to give a complete explanation of how they work here, but I find
this one very nice and
clear.

What I **do** want to note here is that while some monads, like `IO`, are implemented through some deep magic, others have simple and pure implementations. So the entire computation in these monads is pure.

There are many monads that express all kinds of effects. It is a very beautiful and powerful theory: they all implement the same interface. We will talk about the three following monads:

- `Either e a`: a computation that returns a value of type `a` or throws an error of type `e`. The behaviour is very much like exceptions in imperative languages, and the errors may be caught. The main difference is that this monad is implemented entirely in the standard library, while in imperative languages exceptions are usually implemented by the operating system or virtual machine.
- `State s a`: a computation that returns a value of type `a` and has access to a modifiable state of type `s`.
- `Maybe a`: the `Monad` instance for `Maybe` expresses a computation that can be interrupted at any moment by returning `Nothing`. But we will mostly speak of the `MonadPlus` instance, which expresses the opposite effect: a computation that can be interrupted at any moment by returning a concrete value.
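A tiny illustration of each of the three effects, as a sketch. The function names are made up for this example; `State` is taken from the `mtl` package's `Control.Monad.State`.

```haskell
import Control.Monad (msum)
import Control.Monad.State (State, evalState, get, put)

-- Either: stop with an error value instead of a result.
safeDiv :: Int -> Int -> Either String Int
safeDiv _ 0 = Left "division by zero"
safeDiv x y = Right (x `div` y)

-- State: read and update a counter, returning its old value.
nextLabel :: State Int Int
nextLabel = do
    n <- get
    put (n + 1)
    return n

-- Maybe via MonadPlus: msum yields the first Just and skips every Nothing,
-- so the search is "interrupted" as soon as a concrete value appears.
firstEven :: [Int] -> Maybe Int
firstEven xs = msum [ if even x then Just x else Nothing | x <- xs ]
```

For instance, `evalState ((,) <$> nextLabel <*> nextLabel) 5` gives `(5,6)`, and `firstEven [1,3,4,6]` gives `Just 4`.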

We have two data types, `Graph a` and `Bigraph a b`, the first of them representing graphs with vertex labels of type `a` and the second representing bipartite graphs with left part labels of type `a` and right part labels of type `b`.

**A Word of Warning**: These are not Alga data types. Alga representation for
bipartite graphs is not yet released and there is no representation for
undirected graphs.

We also assume that we have the following functions.

```
-- List of neighbours of a given vertex.
neighbours :: Ord a => a -> Graph a -> [a]

-- Convert a graph with vertices labelled with their parts to a bipartite
-- graph, ignoring the edges within one part.
toBipartiteWith :: (Ord a, Ord b, Ord c) => (a -> Either b c)
                                         -> Graph a
                                         -> Bigraph b c

-- List of vertices.
vertexList :: Ord a => Graph a -> [a]
```

Now we write the definition for the function we are going to implement.

```
type OddCycle a = [a]
detectParts :: Ord a => Graph a -> Either (OddCycle a) (Bigraph a a)
```

It can be easily seen that the odd cycle is at the top of the recursion stack in case we failed to find the partition. So, in order to restore it, we only need to cut everything from the recursion stack before the first occurrence of the last vertex.
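As a worked example of this cut, suppose the search went 1, 2, 3, 4, 5 and then found a conflicting edge from 5 back to 3, so the collected stack is `[1,2,3,4,5,3]`. A minimal sketch of the cut, using the hypothetical name `cutOddCycle`:

```haskell
-- Drop everything up to and including the first occurrence of the last
-- vertex; what remains is the cycle that closes at that vertex.
cutOddCycle :: Eq a => [a] -> [a]
cutOddCycle [] = []
cutOddCycle c  = drop 1 (dropWhile (/= last c) c)
```

Here `cutOddCycle [1,2,3,4,5,3]` gives `[4,5,3]`, the cycle through 3, 4 and 5, which has odd length.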

We will implement a Depth-First Search, while maintaining a map of part identifiers for each vertex. The recursion stack for the vertex in which we failed to find the partition will be automatically restored with the `Functor` instance for the monad we choose: we only need to put all vertices from the path into the result on our way back from the recursion.

The first idea is to use the `Either` monad, which fits our goals perfectly well. The first implementation I had was something very close to that. In fact, I had five different implementations at some point to choose the best, and I finally settled on another option.

First, we need to maintain a map of part identifiers: this is a job for `State`. Then, we need to stop when we find a conflict. This could be either the `Monad` instance for `Either` or the `MonadPlus` instance for `Maybe`. The main difference is that `Either` has a value to be returned in case of success, while the `MonadPlus` instance for `Maybe` only returns a value in case we failed to find the partition. As we don’t need a value on success because it’s already stored in `State`, we choose `Maybe`. Now we need to combine two monadic effects, so we need monad transformers, which are a way to combine several monadic effects.
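To see what stacking `MaybeT` on top of `State` buys, here is a small sketch unrelated to graphs. The names `record` and `demo` are made up for this example; `MaybeT` comes from the `transformers` package and `State` from `mtl`.

```haskell
import Control.Monad (guard)
import Control.Monad.State (State, modify, runState)
import Control.Monad.Trans.Maybe (MaybeT, runMaybeT)

-- Record each number in the state, then fail on the first negative one.
record :: Int -> MaybeT (State [Int]) ()
record x = do
    modify (x :)
    guard (x >= 0)

-- Run the stack: MaybeT is peeled off first, then State.
-- The state survives the failure, because MaybeT only wraps the result.
demo :: (Maybe (), [Int])
demo = runState (runMaybeT (mapM_ record [1, 2, -3, 4])) []
```

Here `demo` evaluates to `(Nothing, [-3,2,1])`: the computation stopped at `-3`, the `4` was never processed, and everything written to the state before the failure is still available. This is the property that lets the part map outlive an interrupted search.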

Why did I choose such a complicated type? There are two reasons. The first is that the implementation becomes very similar to the one we have in imperative languages. The second is that I needed to manipulate the value returned in case of conflict to restore the odd cycle, and this becomes much simpler in `Maybe`.

So, here we go now.

```
{-# LANGUAGE ExplicitForAll #-}
{-# LANGUAGE ScopedTypeVariables #-}

import Control.Monad (guard, msum)
import Control.Monad.State (State, get, modify, runState)
import Control.Monad.Trans.Maybe (MaybeT, runMaybeT)
import qualified Data.Map as Map
import Data.Maybe (fromJust)

-- Graph, Bigraph, neighbours, toBipartiteWith and vertexList are the
-- assumed definitions listed earlier.

-- The two parts; Eq is needed for the conflict check in onEdge.
data Part = LeftPart | RightPart deriving Eq

otherPart :: Part -> Part
otherPart LeftPart  = RightPart
otherPart RightPart = LeftPart

type PartMap a = Map.Map a Part
type OddCycle a = [a]

-- Label a vertex with its part; every vertex is in the map after the DFS.
toEither :: Ord a => PartMap a -> a -> Either a a
toEither m v = case fromJust (v `Map.lookup` m) of
                    LeftPart  -> Left  v
                    RightPart -> Right v

-- Just a path means a conflict was found; the part map lives in State.
type PartMonad a = MaybeT (State (PartMap a)) [a]

detectParts :: forall a. Ord a => Graph a -> Either (OddCycle a) (Bigraph a a)
detectParts g = case runState (runMaybeT dfs) Map.empty of
                     (Just c, _)  -> Left  $ oddCycle c
                     (Nothing, m) -> Right $ toBipartiteWith (toEither m) g
    where
        inVertex :: Part -> a -> PartMonad a
        inVertex p v = ((:) v) <$> do modify $ Map.insert v p
                                      let q = otherPart p
                                      msum [ onEdge q u | u <- neighbours v g ]

        {-# INLINE onEdge #-}
        onEdge :: Part -> a -> PartMonad a
        onEdge p v = do m <- get
                        case v `Map.lookup` m of
                             Nothing -> inVertex p v
                             Just q  -> do guard (q /= p)
                                           return [v]

        processVertex :: a -> PartMonad a
        processVertex v = do m <- get
                             guard (v `Map.notMember` m)
                             inVertex LeftPart v

        dfs :: PartMonad a
        dfs = msum [ processVertex v | v <- vertexList g ]

        -- Cut the stack before the first occurrence of the last vertex.
        oddCycle :: [a] -> [a]
        oddCycle c = tail (dropWhile (/= last c) c)
```

I’ll try to explain each of the first four scoped functions: this is the core of the algorithm.

- `inVertex` is the part of DFS that happens when we visit the vertex for the first time. Here, we assign the vertex to the part and launch `onEdge` for every incident edge. And that’s the place where we hope to restore the call stack: if a `Just` is returned from some edge, we add `v` to the beginning.
- `onEdge` is the part that happens when we visit any edge. It happens twice for each edge. Here we check if the vertex on the other side is visited. If not, we visit it. Otherwise we check whether we found an odd cycle. If we did, we simply return the current vertex as a singleton. The other vertices from the path are added on the way back from the recursion.
- `processVertex` checks if the vertex is visited and runs DFS on it if not.
- `dfs` runs `processVertex` on all vertices.

That’s it.

When I first wrote the above code, `onEdge` was not explicitly inlined. Then, when I was benchmarking different versions of `detectParts` to select the best, I noticed that on some graphs this version with transformers had a serious overhead over the version with `Either`. I had no idea what was going on, because semantically the two functions were supposed to perform the same operations. And it became even weirder when I ran it on another machine with another version of GHC and didn’t notice any overhead there.

After a weekend of reading GHC Core code, I managed to fix this with one explicit inline. At some point between GHC 8.4.4 and GHC 8.6.5 the optimizer was changed in such a way that it no longer inlined `onEdge`.

This is just a crazy thing about programming I didn’t expect to come across in Haskell. Still, it seems that optimizers make mistakes even in our time, and it is our job to give them hints about what should be done. For example, here we knew that the function should be inlined, as it is in the imperative version, and that’s a reason to give GHC a hint.

When this patch is merged, I’m going to start implementing the Hopcroft-Karp algorithm. I think the BFS part is going to be rather interesting, so the next blog post will come in a couple of weeks.

]]>The idea of the project was on the ideas list published earlier. Two of us were accepted for this project, the other being Adithya Kumar, who will be doing the work described on the ideas list. He told me his GSoC blog will probably be here.

My task is to introduce bipartite graphs to Alga and that is what I am going to tell you about now.

There are three common ways to represent graphs in computing:

- Adjacency matrix
- Adjacency lists
- Edge lists.

All three of them have their advantages and disadvantages. The most commonly used is the adjacency lists approach, that is, storing a list of neighbors for each vertex. In fact, I can think of only one common algorithm for which this approach is not perfect: Kruskal’s algorithm for finding the minimum spanning tree.

However, the problem is that feeding graphs formed this way to algorithms is not always safe. For example, if the algorithm is designed for bidirectional graphs, it may rely on the fact that if some vertex `u` is in the list of neighbors of some other vertex `v` then `v` is in the list of neighbors of `u`.

A traditional solution for functional programming would be to guarantee the
consistency of input data for the algorithm by taking a representation of the
graph that would not allow a wrong graph to be passed. That’s what we call
*type safety*.

Alga is a library that provides such a safe representation with a beautiful algebraic interpretation. It also has a nice set of algorithms out of the box. You can find the paper on Alga by its author here; I’m just going to cover some basics.

Consider the following definition for the graph data type:

```
data Graph a = Empty
             | Vertex a
             | Overlay (Graph a) (Graph a)
             | Connect (Graph a) (Graph a)
```

The constructors mean the following:

- `Empty` constructs an empty graph.
- `Vertex v` constructs a graph with a single vertex labeled `v`.
- `Overlay g h` constructs a graph whose vertex and edge sets are the unions of those of graphs `g` and `h`.
- `Connect g h` does the same as `Overlay` and also connects all vertices of `g` to all vertices of `h`.
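The meaning of the four constructors can be spelled out as a pair of interpretation functions. The sketch below is only an illustration: `vertexSet`, `edgeSet` and `example` are names chosen here, and edges are treated as directed pairs.

```haskell
import qualified Data.Set as Set

data Graph a = Empty
             | Vertex a
             | Overlay (Graph a) (Graph a)
             | Connect (Graph a) (Graph a)

-- All vertex labels that occur in the expression.
vertexSet :: Ord a => Graph a -> Set.Set a
vertexSet Empty         = Set.empty
vertexSet (Vertex v)    = Set.singleton v
vertexSet (Overlay g h) = vertexSet g `Set.union` vertexSet h
vertexSet (Connect g h) = vertexSet g `Set.union` vertexSet h

-- All directed edges described by the expression.
edgeSet :: Ord a => Graph a -> Set.Set (a, a)
edgeSet Empty         = Set.empty
edgeSet (Vertex _)    = Set.empty
edgeSet (Overlay g h) = edgeSet g `Set.union` edgeSet h
edgeSet (Connect g h) = Set.unions
    [ edgeSet g
    , edgeSet h
      -- the extra edges: everything in g points to everything in h
    , Set.fromList [ (u, v) | u <- Set.toList (vertexSet g)
                            , v <- Set.toList (vertexSet h) ]
    ]

-- Vertex 1 connected to both 2 and 3.
example :: Graph Int
example = Connect (Vertex 1) (Overlay (Vertex 2) (Vertex 3))
```

For `example`, `edgeSet` yields the two edges `(1,2)` and `(1,3)`, while `vertexSet` contains all three labels.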

One can easily construct a `Graph` of linear size from a list of edges of the desired graph. In fact, this approach may even save memory for dense graphs compared to adjacency lists. And this approach is surely *type safe* in the sense described above. Compared to adjacency lists, there is no problem with an edge not being present in the list of neighbours of the other vertex. Another possible problem with adjacency lists that does not arise here is that an edge might lead to a vertex with no associated adjacency list.
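As a rough illustration of the linear-size claim, the following sketch builds a `Graph` expression from an edge list. `fromEdgeList` and `size` are hypothetical names introduced for this example and are not taken from Alga.

```haskell
data Graph a = Empty
             | Vertex a
             | Overlay (Graph a) (Graph a)
             | Connect (Graph a) (Graph a)
             deriving (Eq, Show)

-- One Connect per edge, all of them overlaid together.
fromEdgeList :: [(a, a)] -> Graph a
fromEdgeList es = foldr Overlay Empty [ Connect (Vertex u) (Vertex v) | (u, v) <- es ]

-- Number of constructors in the expression, as a measure of its size.
size :: Graph a -> Int
size Empty         = 1
size (Vertex _)    = 1
size (Overlay g h) = 1 + size g + size h
size (Connect g h) = 1 + size g + size h
```

Each edge contributes one `Overlay`, one `Connect` and two `Vertex` nodes, so an edge list of length k gives an expression with 4k + 1 constructors. The saving for dense graphs comes from the other direction: a single `Connect` of two n-vertex overlays describes n² edges with an expression of size linear in n.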

Why algebraic? Well, if we write down simple laws for these graphs we will see that the laws for the `Connect` and `Overlay` operations are very similar to those for multiplication and addition in a semiring, respectively.
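A few of these laws can be checked on concrete graphs by comparing vertex and edge sets. This is a sketch under the assumption that two expressions are considered equal when they describe the same sets; `toSets`, `sameGraph` and the law names are made up for this example.

```haskell
import qualified Data.Set as Set

data Graph a = Empty
             | Vertex a
             | Overlay (Graph a) (Graph a)
             | Connect (Graph a) (Graph a)

-- Interpret an expression as (vertex set, directed edge set).
toSets :: Ord a => Graph a -> (Set.Set a, Set.Set (a, a))
toSets Empty         = (Set.empty, Set.empty)
toSets (Vertex v)    = (Set.singleton v, Set.empty)
toSets (Overlay g h) = (vg <> vh, eg <> eh)
  where
    (vg, eg) = toSets g
    (vh, eh) = toSets h
toSets (Connect g h) = (vg <> vh, eg <> eh <> cross)
  where
    (vg, eg) = toSets g
    (vh, eh) = toSets h
    cross    = Set.fromList [ (u, v) | u <- Set.toList vg, v <- Set.toList vh ]

sameGraph :: Ord a => Graph a -> Graph a -> Bool
sameGraph g h = toSets g == toSets h

-- Overlay behaves like addition: it is commutative.
overlayCommutes :: Ord a => Graph a -> Graph a -> Bool
overlayCommutes x y = Overlay x y `sameGraph` Overlay y x

-- Connect behaves like multiplication: it distributes over Overlay.
connectDistributes :: Ord a => Graph a -> Graph a -> Graph a -> Bool
connectDistributes x y z =
    Connect x (Overlay y z) `sameGraph` Overlay (Connect x y) (Connect x z)
```

One visible difference from a semiring in this interpretation is that `Empty` acts as the identity for both operations, which is one reason the laws are "very similar" rather than identical.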

This was just a brief description of Alga. There are many other parts not covered here. One example is that `Graph` might also be provided as a type class rather than a data type. This approach is much more flexible.
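To give an idea of what the type class formulation could look like, here is a sketch. The class below is written for illustration and may differ from what Alga actually exports; `Relation`, `edge` and `path` are likewise names chosen for this example.

```haskell
{-# LANGUAGE TypeFamilies #-}

import qualified Data.Set as Set

-- The four constructors become methods; each instance picks its vertex type.
class Graph g where
    type Vertex g
    empty   :: g
    vertex  :: Vertex g -> g
    overlay :: g -> g -> g
    connect :: g -> g -> g

-- Written once against the class, usable with every instance.
edge :: Graph g => Vertex g -> Vertex g -> g
edge u v = connect (vertex u) (vertex v)

path :: Graph g => [Vertex g] -> g
path []  = empty
path [v] = vertex v
path vs  = foldr overlay empty (zipWith edge vs (drop 1 vs))

-- One possible instance: a set of vertices plus a set of directed edges.
data Relation a = Relation
    { domain   :: Set.Set a
    , relation :: Set.Set (a, a)
    } deriving (Eq, Show)

instance Ord a => Graph (Relation a) where
    type Vertex (Relation a) = a
    empty       = Relation Set.empty Set.empty
    vertex v    = Relation (Set.singleton v) Set.empty
    overlay g h = Relation (domain g <> domain h) (relation g <> relation h)
    connect g h = Relation (domain g <> domain h)
        (relation g <> relation h <> Set.fromList
            [ (u, v) | u <- Set.toList (domain g), v <- Set.toList (domain h) ])
```

The flexibility comes from the fact that `path` never mentions a concrete representation: evaluating it at type `Relation Int` builds sets directly, and another instance could build an adjacency map or an expression tree from the same definition.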

An important part of Alga is providing different type-safe representations for different kinds of graph. For example, one for edge-labeled graphs was introduced last year.

Another option is to add a representation that *restricts* the set of possible
graphs. One example from the ideas list is to represent only acyclic directed
graphs. This is what Adithya will be doing. And my task for the first
evaluation period is to provide bipartite graphs.

We often meet bipartite graphs in the real world: connections between entities of different kinds are common. For example, the graph of clients and the backends they use is bipartite. Another example I can think of comes from content recommendation systems: the graph of users and the films or songs they like is bipartite, too.

There are many ideas on how to represent them. For example, in my proposal I suggested an approach that seems to match Alga’s design:

```
data Bigraph a b = Empty
                 | LeftVertex a
                 | RightVertex b
                 | Overlay (Bigraph a b) (Bigraph a b)
                 | Connect (Bigraph a b) (Bigraph a b)
```

Here, `Connect` only connects left vertices to the right. As my mentor Andrey figured, there is an interesting addition to the laws: `(LeftVertex u) * (LeftVertex v) = (LeftVertex u) + (LeftVertex v)`. Of course, the same holds for the right vertices.
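The extra law can be checked with a small interpretation of `Bigraph`. This sketch assumes that edges are undirected and stored as (left, right) pairs, so `Connect g h` joins the left vertices of either argument to the right vertices of the other. That reading is an assumption made here, not something fixed by the proposal, and the function names are made up for the example.

```haskell
import qualified Data.Set as Set

data Bigraph a b = Empty
                 | LeftVertex a
                 | RightVertex b
                 | Overlay (Bigraph a b) (Bigraph a b)
                 | Connect (Bigraph a b) (Bigraph a b)

leftVertices :: Ord a => Bigraph a b -> Set.Set a
leftVertices Empty           = Set.empty
leftVertices (LeftVertex u)  = Set.singleton u
leftVertices (RightVertex _) = Set.empty
leftVertices (Overlay g h)   = leftVertices g <> leftVertices h
leftVertices (Connect g h)   = leftVertices g <> leftVertices h

rightVertices :: Ord b => Bigraph a b -> Set.Set b
rightVertices Empty           = Set.empty
rightVertices (LeftVertex _)  = Set.empty
rightVertices (RightVertex v) = Set.singleton v
rightVertices (Overlay g h)   = rightVertices g <> rightVertices h
rightVertices (Connect g h)   = rightVertices g <> rightVertices h

-- Edges always go between the two parts, never inside one.
edges :: (Ord a, Ord b) => Bigraph a b -> Set.Set (a, b)
edges (Overlay g h) = edges g <> edges h
edges (Connect g h) = Set.unions
    [ edges g
    , edges h
    , cross (leftVertices g) (rightVertices h)
    , cross (leftVertices h) (rightVertices g)
    ]
  where
    cross ls rs = Set.fromList [ (l, r) | l <- Set.toList ls, r <- Set.toList rs ]
edges _ = Set.empty
```

Under this interpretation, connecting two left vertices produces no edges at all, so `Connect (LeftVertex u) (LeftVertex v)` and `Overlay (LeftVertex u) (LeftVertex v)` describe the same graph.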

For now, we have agreed that I will first focus on implementing adjacency maps for bipartite graphs (hey, didn’t I mention that Alga uses adjacency maps on the inside?). It doesn’t make much sense to make a separate algebraic representation, but I may do it if I find something interesting in it.

Now, the first task is to implement the conversion function, which I’m going to start right now. This implementation will simply ignore the edges between vertices of the same part.

```
fromGraph :: Graph (Either a b) -> Bipartite.AdjacencyMap a b
fromGraph = undefined
```

With this stub, my summer-long dive into Haskell begins!
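To illustrate what "ignoring the edges between vertices of the same part" might mean in practice, here is one possible sketch of the conversion. It is not the implementation planned for Alga: `BipartiteSets` is a hypothetical stand-in for `Bipartite.AdjacencyMap`, `fromGraphSketch` is a made-up name, and edges are assumed to be undirected, so a right-to-left edge is stored as a (left, right) pair.

```haskell
import qualified Data.Set as Set

data Graph a = Empty
             | Vertex a
             | Overlay (Graph a) (Graph a)
             | Connect (Graph a) (Graph a)

-- Hypothetical stand-in for the bipartite adjacency map.
data BipartiteSets a b = BipartiteSets
    { leftSet   :: Set.Set a
    , rightSet  :: Set.Set b
    , edgePairs :: Set.Set (a, b)
    } deriving (Eq, Show)

-- Vertices and edges as lists; duplicates are removed later by the sets.
vertices :: Graph a -> [a]
vertices Empty         = []
vertices (Vertex v)    = [v]
vertices (Overlay g h) = vertices g ++ vertices h
vertices (Connect g h) = vertices g ++ vertices h

edges :: Graph a -> [(a, a)]
edges (Overlay g h) = edges g ++ edges h
edges (Connect g h) = edges g ++ edges h
                   ++ [ (u, v) | u <- vertices g, v <- vertices h ]
edges _             = []

fromGraphSketch :: (Ord a, Ord b) => Graph (Either a b) -> BipartiteSets a b
fromGraphSketch g = BipartiteSets
    (Set.fromList [ a | Left a <- vs ])
    (Set.fromList [ b | Right b <- vs ])
    -- Left-Left and Right-Right edges match neither pattern and are dropped.
    (Set.fromList ([ (a, b) | (Left a, Right b) <- es ]
                ++ [ (a, b) | (Right b, Left a) <- es ]))
  where
    vs = vertices g
    es = edges g
```

The pattern matches inside the list comprehensions do the filtering: any edge whose endpoints lie in the same part simply fails to match and is skipped.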

]]>We would like to thank everyone who submitted ideas – this is a key part of being accepted into GSoC. Now, here’s the near term timeline:

- **Today - March 25**: Potential student participants discuss application ideas with mentors
- **March 25 - April 9**: Students can submit applications
- **May 6**: Accepted student proposals announced

At this point, we’re looking for both students and extra mentors. We would like to assign at least two mentors to each project if possible, so the students get the support they deserve. Additional ideas for projects are still welcome!

There’s a lot of information on our Summer of Haskell page. If there are any students who are not sure where to begin, feel free to reach out to us directly!

]]>