I think this is fairly normal. I’m fine with a couple of moves, but if the sequence is longer than five moves, I’ll often be flipping back and forth between the problem diagram and the solution to get it clear in my head.
Similarly, for theory books I like to see two diagrams for each example: one with just the starting position and one with the sequence of moves. When it says “in this position you should play <sequence of 15 moves>” and all you have is a single diagram with all the moves on it, I can’t “see” the starting position unless I lay it out on a board.
Related: What do you see in your head when you read? I was surprised by the responses: generally, people see less than I expected, and I was reassured to find it’s not just me!