In the context of programming, recursion is the process of a function calling itself as part of its execution. It's very effective for data structures with a recursive structure. For example, in a binary search tree there isn't a difference in structure between the root/base of a tree and any node in the tree. You could 'pick off' any node and it would still be a valid tree. This holds true for linked lists as well; there's no structural difference between the head of a list and any other node in the list.

You might like to think about the recursive process as two parts: the recursive step and the base case.

This is the logic of the program that continutes to recurse deeper into your structure. A recursive linked list function might perform some task and then call the function again on the next node in the list. A recursive binary tree function might call the same function again on its left and right nodes/subtrees.

The base case is the point at which we stop recursing. This is usually when we've fulfilled some condition of the function, or we reach the end of our list/tree (a NULL node).

Don't worry! It may be confusing at first, but once you've seen it work, it becomes clear that there's not really any magic going on. So let's look at a quick example!

Alright, we're going to create a recursive function that returns a pointer to the node with the largest number in a given list. Here's the structure of the nodes in the list, nothing fancy:

typedef struct lnode Lnode; struct lnode { int data; Lnode* next; };

And here's the a starting point for our function:

Lnode* findLargest(Lnode* head){ Lnode* largest = NULL; return largest; }

Not very exciting, is it? Well let's start thinking about it for a moment; what do we want to do?
How can we decide if a node has the largest value in the list? Using the idea of recursion, we can
consider each node having a pointer to *another* list. Then for a given node, we just need to
work out whether its value is greater than the largest value in the list it points to. However,
you may have noticed that we're solving this problem by using the problem itself: "To find the largest
value in the list, we compare and return the largest value of the current node and the largest value in
the list it points to". As we write a recursive function we can call the function and *assume*
that it will return the correct value. Of course, it's up to us to make sure it does so! So let's
define our recursive step and base case. If we have a non-NULL node pointer, we want to find the
largest value between that node and the list it points to. If we have a NULL pointer, we simply
return NULL (as there's no largest node in an empty list).

Lnode* findLargest(Lnode* head){ Lnode* largest = NULL; if (head != NULL){ // Find and compare the largest node } return largest; }

Here we've set up our base case as being an empty list, and the recursive step is finding the
largest node between the current node and the list it points to. But how do we do that? Well,
We can *assume* that our function works, and call it on the current node's next pointer to
retrieve the largest node in that list. We can then compare the values of the nodes and store
the pointer to the larger one!

Lnode* findLargest(Lnode* head){ Lnode* largest = NULL; if (head != NULL){ Lnode* listLargest = findLargest(head->next); if (head->data >= listLargest->data){ largest = head; } else { largest = listLargest; } } return largest; }

Great! So now we're storing the pointer to the largest node in sublist into
`listLargest`

, and then we compare the value of that node with the current
node to determine which one to return. There's a subtly here in the `>=`

, which
means that if there our two nodes in the list with the same value, the 'earlier' node (the one
closest to the head) will be returned. Surprisingly, we're actually almost done! One last
thing we need to take care of is if the `listLargest`

value is NULL, which happens when we're
working on the last node in the list. In this case, we can simply check for NULL as well.

Lnode* findLargest(Lnode* head){ Lnode* largest = NULL; if (head != NULL){ Lnode* listLargest = findLargest(head->next); if (listLargest == NULL || head->data >= listLargest->data){ largest = head; } else { largest = listLargest; } } return largest; }

And we're done! It may seem surprising, because it can feel like we didn't have solve the problem of finding the largest node in the sublist, but the trick here is that by writing the code to find the largest node in this list, we've solved the problem of finding it in the sublist! Here's an iterative version to compare to, see which one you like better!

Lnode* findLargest(Lnode* head){ Lnode* largest = head; while (head != NULL){ if (head->data > largest->data){ largest = head; } head = head->next; } return largest; }