I am neither a linguist nor a philosopher of language, so what I will say is naive and may be completely silly.
It seems to be common to divide up the task of analyzing language between syntax and semantics. Syntax determines how to classify linguistic strings into categories such as "sentence", "well-formed formula", "predicate", "name", etc. If the division is merely pragmatic, that's fine. But if something philosophical is supposed to ride on the division, we should be cautious. Concepts like "sentence" and "predicate" are ones that we need semantic vocabulary to explain—a sentence is the sort of thing that could be true or false, or maybe the sort of thing that is supposed to express a proposition. A predicate is the sort of thing that can be applied to one or more referring expressions.
If we want syntax to be purely formal, we should see it as classifying permissible utterances into a bunch of formal categories. As pure syntacticians, we should not presuppose any particular set of categories into which the strings are to be classified. If we are not to suppose any specific semantic concepts, the basic category should be, I think, that of a "permissible conversation" (it may well be that the concept of a "conversation" is itself semantic, but it will be the most general semantic concept). Then, as pure syntacticians, we study permissible conversations, trying to classify their components. We can model a permissible conversation as a string of characters tagged by speaker (we could model the tagging as colors, putting what is spoken by different people in different colors). Our task as pure syntacticians is then to find the natural rules for generating permissible conversations.
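To make the model a little more concrete, here is a minimal sketch of a conversation as a string of speaker-tagged characters. The names `TaggedChar`, `conversation`, and `runs` are my own illustrative assumptions, not standard terminology, and nothing here says which conversations are permissible. The point is only that the raw data contains characters and tags, with no built-in notion of "sentence" or "predicate".

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class TaggedChar:
    """One character of a conversation, tagged by who uttered it."""
    speaker: str  # hypothetical tag; plays the role of the "color"
    char: str


def conversation(turns: List[Tuple[str, str]]) -> List[TaggedChar]:
    """Flatten (speaker, text) turns into one speaker-tagged character string."""
    return [TaggedChar(speaker, ch) for speaker, text in turns for ch in text]


def runs(conv: List[TaggedChar]) -> List[Tuple[str, str]]:
    """Group maximal same-speaker stretches, using only the tags."""
    out: List[Tuple[str, str]] = []
    for tc in conv:
        if out and out[-1][0] == tc.speaker:
            # Same speaker as the previous character: extend the current run.
            out[-1] = (tc.speaker, out[-1][1] + tc.char)
        else:
            # Speaker changed (or first character): start a new run.
            out.append((tc.speaker, tc.char))
    return out
```

Splitting into same-speaker runs is about the only classification that the tags alone give us for free. Any further category would have to be discovered from regularities across many permissible conversations.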
It may well be that in the case of a human language, the natural generating rules for speakers will make use of concepts such as "sentence" and "well-formed formula", but this should not be presupposed at the outset.
Here is an interesting question: Do we have good reason to suppose that if we restricted syntax to something to be discovered by this methodology, the categories we would come up with would be at all like the familiar linguistic categories? I think we are not in a position to know the answer to this. The categories that we in fact have were not discovered by this methodology. They were discovered by a mix of this methodology and semantic considerations. And that seems a better way to generate relevant syntactic categories than the road of pure syntax. But the road that we in fact took does not allow for a neat division of labor between syntax and semantics, since many of our syntactic categories are also natural semantic ones, and it is their semantic naturalness that goes into making them linguistically relevant.