I forgot to add that my definition says:
"x2 is the cmavo, prefixed with {la'e} {zo}, that represents the class of structure words."
I think {la'e} is unnecessary here.
This means that in {cmavo zo bai} the word {bai} stands for the set of all cmavo in that selma'o. Or you could say that it has two meanings, but since the second is confined to Lojban grammar, no ambiguity arises when describing the outside world.
{zo bai poi valsi} is a word, {zo bai poi selma'o} is a selma'o.
Alternatively, you could still use {cmavo lu'i zo bai}, with the note that by design cmavo2 admits only the set of cmavo that actually constitutes the class BAI, not any other set. So {".a", "be", "ci", "do", "fu"} and every other set except the BAI set will not pass.
Actually, the same problem arises with letterals.
How do you refer to the Lojbanic sound [b]? {zo by} or {zoi lojb.b.lojb.}? But that is a letter, not a sound.