How is it possible that the default implementations of Haskell's Bifunctor first and second compose?

I've been studying the category theory for about three months now. Recently, I felt surprised with the definition of Haskell's Bifunctor:

class Bifunctor f where

    // type signatures
    bimap :: (a -> c) -> (b -> d) -> f a b -> f c d
    first :: (a -> c) -> f a b -> f c b
    second :: (b -> d) -> f a b -> f a d

    // default implementations
    bimap g h = first g · second h // <--- surprising composition
    first g = bimap g id
    second h = bimap id

I felt surprised that first g would compose with second h . I felt confused because...

The return type of second h is f a d .
The argument of first g is f a b .
How can f a d go into f a b ?

This is how I reasoned through it. I started with the default implementation of bimap , which has the surprising composition, and used equational reasoning (of a kind) to convert the composition into a type signature. The final type signature has two unknown types x and y .

// default implementation of `bimap`
bimap g h = first g · second h

// replace `g` and `h` with their type signatures
bimap (a -> c) (b -> d) :: first (a -> c) · second (b -> d)

// replace `first (a -> c)` with its return type
// using the types from the bimap scope
bimap (a -> c) (b -> d) :: (f a x -> f c x) · second (b -> d)

// replace `second (b -> d)` with its return type
// using the types from the bimap scope
bimap (a -> c) (b -> d) :: (f a x -> f c x) · (f y b -> f y d)

I noticed that my substitution left x and y as unknown types. That is because first ignores the b that bimap receives, and second ignores the a that bimap receives. At this piont, all we know from the type signatures of first and second is that neither x nor y change. I decided to try to infer the types of x and y based on the assumption that first and second compose.

Here is how I did the manual type inference.

// write the target signature for bimap from its definition
bimap :: (a -> c) -> (b -> d) -> f a b -> f c d

// write the target signature after partial application with `g` and `h`
bimap (a -> c) (b -> d) :: f a b -> f c d

// notice that the target signature's returns type is the function (f a b -> f c d)

// use the target return type to infer the unknown typse
bimap (a -> c) (b -> d) 
    // number the unknown types
    :: (f a x1 -> f c x2) · (f y1 b -> f y2 d)

    // `x2` must be `d` to produce the target (f c d) 
    :: (f a x1 -> f c d) · (f y1 b -> f y2 d)

    // `y1` must be `a` to produce the target (f a b) 
    :: (f a x1 -> f c d) · (f a b -> f y2 d)

    // `y2` must be `a` because it is the same type as y1
    :: (f a x1 -> f c d) · (f a b -> f a d)

    // `x1` must be `d` because it is the same type as x2
    :: (f a d -> f c d) · (f a b -> f a d)

    // perform the composition
    :: (f a b -> f c d)

Lo and behold the composition lands at the target return type.

The type inference demonstrated that, in order for the composition to work, the implementations of first and second must adhere to these constraints:

first must not change its second argument's type
first must change a into c
second must not change its first argument's type
second must change b into d

With some more equational reasoning (of a type) we can see that the default implementations of first and second adhere to those constraints.

first g = bimap g id

// replace `bimap g id` with its type
first g :: (a -> c) -> (x -> x) -> f a x -> f c x

second h = bimap id

// replace `bimap id h` with its type
second h :: (x -> x) -> (b -> d) -> f a b -> f c d

Result: I have convinced myself that first and second do compose.