I want to come back to this as I don't think it's written very well... There's every chance the page will confuse you more than you were previously...

Recursion

What is recursion?

In the context of programming, recursion is the process of a function calling itself as part of its execution. It's very effective for data structures with a recursive structure. For example, in a binary search tree there isn't a difference in structure between the root/base of a tree and any node in the tree. You could 'pick off' any node and it would still be a valid tree. This holds true for linked lists as well; there's no structural difference between the head of a list and any other node in the list.

So what do we need?

You might like to think about the recursive process as two parts: the recursive step and the base case.

Recursive step

This is the logic of the program that continutes to recurse deeper into your structure. A recursive linked list function might perform some task and then call the function again on the next node in the list. A recursive binary tree function might call the same function again on its left and right nodes/subtrees.

Base case

The base case is the point at which we stop recursing. This is usually when we've fulfilled some condition of the function, or we reach the end of our list/tree (a NULL node).

This sounds confusing...

Don't worry! It may be confusing at first, but once you've seen it work, it becomes clear that there's not really any magic going on. So let's look at a quick example!

Finding the largest element in a list

Alright, we're going to create a recursive function that returns a pointer to the node with the largest number in a given list. Here's the structure of the nodes in the list, nothing fancy:

typedef struct lnode Lnode;

struct lnode {
    int data;
    Lnode* next;
};

And here's the a starting point for our function:

Lnode* findLargest(Lnode* head){
    Lnode* largest = NULL;
    
    return largest;
}

Not very exciting, is it? Well let's start thinking about it for a moment; what do we want to do? How can we decide if a node has the largest value in the list? Using the idea of recursion, we can consider each node having a pointer to another list. Then for a given node, we just need to work out whether its value is greater than the largest value in the list it points to. However, you may have noticed that we're solving this problem by using the problem itself: "To find the largest value in the list, we compare and return the largest value of the current node and the largest value in the list it points to". As we write a recursive function we can call the function and assume that it will return the correct value. Of course, it's up to us to make sure it does so! So let's define our recursive step and base case. If we have a non-NULL node pointer, we want to find the largest value between that node and the list it points to. If we have a NULL pointer, we simply return NULL (as there's no largest node in an empty list).

Lnode* findLargest(Lnode* head){
    Lnode* largest = NULL;
    
    if (head != NULL){
        // Find and compare the largest node
    }
    
    return largest;
}

Here we've set up our base case as being an empty list, and the recursive step is finding the largest node between the current node and the list it points to. But how do we do that? Well, We can assume that our function works, and call it on the current node's next pointer to retrieve the largest node in that list. We can then compare the values of the nodes and store the pointer to the larger one!

Lnode* findLargest(Lnode* head){
    Lnode* largest = NULL;
    
    if (head != NULL){
        Lnode* listLargest = findLargest(head->next);
        if (head->data >= listLargest->data){
            largest = head;
        } else {
            largest = listLargest;
        }
    }
    
    return largest;
}

Great! So now we're storing the pointer to the largest node in sublist into listLargest, and then we compare the value of that node with the current node to determine which one to return. There's a subtly here in the >=, which means that if there our two nodes in the list with the same value, the 'earlier' node (the one closest to the head) will be returned. Surprisingly, we're actually almost done! One last thing we need to take care of is if the listLargest value is NULL, which happens when we're working on the last node in the list. In this case, we can simply check for NULL as well.

Lnode* findLargest(Lnode* head){
    Lnode* largest = NULL;
    
    if (head != NULL){
        Lnode* listLargest = findLargest(head->next);
        if (listLargest == NULL || head->data >= listLargest->data){
            largest = head;
        } else {
            largest = listLargest;
        }
    }
    
    return largest;
}

And we're done! It may seem surprising, because it can feel like we didn't have solve the problem of finding the largest node in the sublist, but the trick here is that by writing the code to find the largest node in this list, we've solved the problem of finding it in the sublist! Here's an iterative version to compare to, see which one you like better!

Lnode* findLargest(Lnode* head){
    Lnode* largest = head;
    
    while (head != NULL){
        if (head->data > largest->data){
            largest = head;
        }
        head = head->next;
    }
    
    return largest;
}