A transaction consists of a Request, any non-final (1xx) Responses received, and a final Response (2xx, 3xx, 4xx, 5xx, or 6xx). For an INVITE, the ACK to a non-2xx final Response also belongs to the transaction; the ACK to a 2xx Response does not, and a PRACK is a Request with its own Response, so it forms a transaction of its own. For example:
- SIP peer A sends an INVITE Request to SIP peer B
- SIP peer B returns a Response of 100 TRYING; this is a non-final Response, so the transaction is not completed yet
- SIP peer B returns 200 OK (a final Response), accepting the invitation; this completes the transaction
Basically, a transaction is one complete Request-Response exchange.
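The INVITE example above can be sketched in a few lines of Python. This is only a toy model to show the provisional/final distinction, not a real SIP stack; the `Transaction` class and `is_final` helper are hypothetical names made up for illustration.

```python
from dataclasses import dataclass, field


def is_final(status_code: int) -> bool:
    """1xx Responses are non-final (provisional); 2xx through 6xx are final."""
    return 200 <= status_code <= 699


@dataclass
class Transaction:
    """Hypothetical model: one Request plus the Responses received for it."""
    method: str
    responses: list = field(default_factory=list)

    def receive(self, status_code: int) -> None:
        self.responses.append(status_code)

    @property
    def completed(self) -> bool:
        # The transaction completes once any final Response has arrived.
        return any(is_final(code) for code in self.responses)


invite = Transaction("INVITE")
invite.receive(100)       # 100 TRYING: non-final
print(invite.completed)   # False
invite.receive(200)       # 200 OK: final
print(invite.completed)   # True
```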
A dialog is just a series of transactions between two SIP peers. The purpose of a dialog is to set up, possibly modify, and then tear down a session, hence the name Session Initiation Protocol. Since there could be many dialogs in progress between two SIP peers at any time (e.g. there could be many simultaneous calls in progress between two SIP servers), each dialog is identified by the Call-ID header field together with the tag parameters of the From and To header fields. So if SIP peer A gets two BYE Requests at the same time, it can look at these fields to determine which dialog each one belongs to.
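As a rough sketch of that lookup, the snippet below keys a table of active dialogs on the Call-ID plus the two tags, which is how RFC 3261 defines a dialog ID (the tags are parameters on the From and To header fields). It assumes the headers have already been parsed into a plain dict; the keys `call_id`, `from_tag`, and `to_tag`, the function name, and the sample values are all made up for illustration.

```python
from typing import NamedTuple


class DialogId(NamedTuple):
    call_id: str
    local_tag: str
    remote_tag: str


def dialog_id_for_incoming(headers: dict) -> DialogId:
    """Build the dialog key for a Request received by this peer.

    For an incoming Request, the To tag is ours (local) and the
    From tag belongs to the sender (remote).
    """
    return DialogId(headers["call_id"], headers["to_tag"], headers["from_tag"])


# Two calls in progress on SIP peer A (hypothetical values).
active_dialogs = {
    DialogId("call-1@a.example", "a-tag-1", "b-tag-1"): "first call",
    DialogId("call-2@a.example", "a-tag-2", "b-tag-2"): "second call",
}

# Two BYE Requests arriving at peer A at about the same time.
bye_x = {"call_id": "call-2@a.example", "from_tag": "b-tag-2", "to_tag": "a-tag-2"}
bye_y = {"call_id": "call-1@a.example", "from_tag": "b-tag-1", "to_tag": "a-tag-1"}

for bye in (bye_x, bye_y):
    print(active_dialogs[dialog_id_for_incoming(bye)])
```

Run as written, the first BYE resolves to the second call and the second BYE to the first call, so the order of arrival does not matter.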
A typical set of transactions you might see in a dialog could include:
- SIP peer A invites SIP peer B to a session and suggests a certain codec, but does not include authentication and so is rejected
- SIP peer A again invites SIP peer B to a session, this time supplying authentication, and the invitation is accepted
- SIP peer B sends an invitation to change the codec used, and it is accepted
- SIP peer A ends the session
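To make the relationship concrete, here is a small sketch that writes that sequence down as data and derives from it whether a session is currently up. The status codes are assumptions chosen for illustration (401 for the rejected INVITE, 200 for everything accepted), and `Step` and `session_active` are hypothetical names, not part of any SIP library.

```python
from dataclasses import dataclass


@dataclass
class Step:
    """One transaction, reduced to who sent what and how it ended."""
    sender: str
    method: str
    final_status: int
    note: str


dialog = [
    Step("A", "INVITE", 401, "no authentication, rejected"),
    Step("A", "INVITE", 200, "authentication supplied, accepted"),
    Step("B", "INVITE", 200, "invitation to change the codec, accepted"),
    Step("A", "BYE", 200, "session ended"),
]


def session_active(steps) -> bool:
    """True if an INVITE was accepted and no BYE has been accepted since."""
    active = False
    for step in steps:
        accepted = 200 <= step.final_status < 300
        if step.method == "INVITE" and accepted:
            active = True
        elif step.method == "BYE" and accepted:
            active = False
    return active


for count in range(1, len(dialog) + 1):
    print(count, session_active(dialog[:count]))
```

The session only exists between the accepted INVITE and the BYE; the transactions before and after it are signalling only.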
A session is just a media stream (e.g. audio or video) flowing between peers, usually consisting of RTP (and possibly RTCP) packets. For example, if SIP is used to make a voice call, the session is the voice data that is sent between endpoints.
To answer the question of whether you need all three together: yes. You need transactions and dialogs in order to create sessions, and sessions are the whole point of the protocol.
Here is a link to a thread that contains examples of dialogs and transactions