Note: I understand what both MTU and MSS do, so I am not asking about their function here. I understand that when a TCP connection is being established, the MSS is exchanged, and it dictates the maximum size of the segment payload (excluding the TCP and other headers) that one device can send to another.
I also understand that devices have an MTU (which applies at both layer 2 and layer 3), but for simplicity, I'll treat it as the maximum size of the packet (the Ethernet payload) that can be sent or received over a wire.
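To make sure I have the relationship right, here is a minimal sketch of how I picture the MSS being derived from the MTU. It assumes IPv4 and TCP headers of 20 bytes each with no options; the function name `mss_from_mtu` is just something I made up for illustration.

```python
# Header sizes in bytes, assuming IPv4 and TCP with no options.
IPV4_HEADER = 20
TCP_HEADER = 20


def mss_from_mtu(mtu: int) -> int:
    """Largest TCP payload that fits in a single IP packet of size `mtu`."""
    return mtu - IPV4_HEADER - TCP_HEADER


if __name__ == "__main__":
    # A standard Ethernet MTU of 1500 bytes leaves 1460 bytes for TCP payload.
    print(mss_from_mtu(1500))
```

With IP or TCP options present, the headers grow and the usable payload shrinks accordingly.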
My question is, why do we need both? More specifically, why can't we just rely on our MTU to ensure that we're not sending packets that are too large? There's also PMTUD (Path MTU Discovery) which allows the devices to discover whether there are any lowered MTU values along the path.
If there are, the device with the lower MTU sends an ICMP Fragmentation Needed message, and the original sender, on receiving it, sends smaller packets to accommodate the lower MTU in the path.
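As I understand it, PMTUD behaves roughly like the toy model below. This is only a sketch under my own assumptions: the path is a hypothetical list of link MTUs, every packet has the Don't Fragment bit set, and each ICMP Fragmentation Needed message reports the MTU of the link that dropped the packet. The name `discover_path_mtu` is made up for illustration.

```python
def discover_path_mtu(local_mtu: int, hop_mtus: list[int]) -> int:
    """Toy PMTUD: shrink the packet size until it fits every link on the path.

    `hop_mtus` is a hypothetical list of link MTUs between sender and receiver.
    """
    packet_size = local_mtu
    while True:
        # The first link too small for the packet drops it and (in this model)
        # reports its own MTU in an ICMP Fragmentation Needed message.
        blocking_mtu = next((m for m in hop_mtus if m < packet_size), None)
        if blocking_mtu is None:
            # No link complained, so the packet reaches the destination.
            return packet_size
        # The sender lowers its packet size to the reported MTU and retries.
        packet_size = blocking_mtu


if __name__ == "__main__":
    print(discover_path_mtu(1500, [1500, 1400, 1300, 1500]))
```

In this model the sender only learns about a smaller link after a packet has already been dropped, and only if the ICMP message actually makes it back.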
So what's the significance of MSS, then? Why can't we just use and rely on the MTU? Why do we need both?