Mixed models and tagging

Lyren

There are model tags for merged/mixed models such as berry mix. Here are some thoughts on what could be done with these tags:

Should there be implications for these models? For example, berry mix would implicate novelai + stable diffusion + zeipher f222 + r34 e4. As another example to emphasize my point, anysmirk hentai scene mix would implicate anything (model) + smirkingface_(model) + stable diffusion + zeipher f222 + r34 e4.
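To make this option concrete, here is a minimal Python sketch of how an implication from a mix tag to its components could expand a post's tag set. The `IMPLICATIONS` table and the `expand_tags` helper are hypothetical names used only for illustration, not how AIBooru actually implements implications. The underscore tag spellings are also assumed.

```python
# Hypothetical implication table: mix tag -> component model tags.
# The component list is the one given for berry mix above.
IMPLICATIONS = {
    "berry_mix": {"novelai", "stable_diffusion", "zeipher_f222", "r34_e4"},
}


def expand_tags(tags):
    """Return the given tags plus everything they implicate, transitively."""
    result = set(tags)
    pending = list(tags)
    while pending:
        tag = pending.pop()
        for implied in IMPLICATIONS.get(tag, ()):
            if implied not in result:
                result.add(implied)
                pending.append(implied)
    return result


print(sorted(expand_tags({"berry_mix"})))
# ['berry_mix', 'novelai', 'r34_e4', 'stable_diffusion', 'zeipher_f222']
```

With implications in place, a post tagged only with the mix ends up carrying five model tags, which is the trade-off being asked about here.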

Another option would be to have only the model mix tag on the posts. So for example, if a post has used berry mix, it should not have any of the models mentioned above tagged beside it.

Also, I don't know enough about the terminology or technicality of these *mix models. So another question is, should mix tags be tagged with merged model or not?

ANJU

over 2 years ago

I don't think I have much knowledge to offer a good opinion, but I will say this;

If a mix tag exists, then maybe we could make a wiki for that tag, add the models it uses, and only use the single tag for posts? I only say, because I don't like the idea of a post having 5 or 6 different model tags on it, because I think it causes confusion, even if people can see it tagged as merged model.

Having the mix tag along with the models it's comprised of being tagged seems redundant too.

Ideally, I'd want to see as few model tags as possible.

Dramorian

over 2 years ago

I would just stick with writing the merged models in the wiki and call it a day.

antlers anon

over 2 years ago

In my opinion model tags aren't even that useful in the case of merged models. It's impossible to get the merge anyway unless the uploader provides a recipe or a link. I even wanted to create some tags for my own mixes so that I don't have to add a recipe to each post. Something like mix2_(antlers_anon) and then add a recipe and maybe even a link to the wiki. I don't know if we want to encourage creation of tags that might end up having only a few posts though.

Anyway, I'm also against creating implications from mixes to their components. Technically they should implicate merged_model because that's what they are, but I'm not sure how much utility there is in tagging them as such.

Penance

over 2 years ago

I'm starting to think model tags are not useful anyway. The original intention was one post gets one model tag but now people are doing shit like "75% NovelAI + 25% (50% Waifu Diffusion + 50% F22)" or whatever. Model tags were not even reliable to begin with because the same model could have different versions. Now it's even worse.
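To show why one tag per post stops working for recipes like that, here is a small Python sketch that flattens a nested recipe into effective per-model weights. It assumes every step is a plain linear weighted sum, which may not hold for other merge methods. The `flatten` helper and the recipe layout are hypothetical.

```python
def flatten(recipe, scale=1.0, totals=None):
    """Flatten a nested recipe into effective per-model weights.

    A recipe is a list of (weight, component) pairs, where a component is
    either a model name (str) or another recipe (list).
    """
    if totals is None:
        totals = {}
    for weight, component in recipe:
        if isinstance(component, str):
            totals[component] = totals.get(component, 0.0) + scale * weight
        else:
            # Nested mix: its weights are scaled by the outer weight.
            flatten(component, scale * weight, totals)
    return totals


recipe = [
    (0.75, "NovelAI"),
    (0.25, [(0.5, "Waifu Diffusion"), (0.5, "F22")]),
]
print(flatten(recipe))
# {'NovelAI': 0.75, 'Waifu Diffusion': 0.125, 'F22': 0.125}
```

Even this simple case needs three model tags plus the weights to describe it, and the weights are exactly what tags cannot carry.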

ANJU

over 2 years ago

Do you think getting rid of model tags entirely is a viable option?

Penance

over 2 years ago

ANJU said:
Do you think getting rid of model tags entirely is a viable option?

I would prefer to but we'd need an alternative that doesn't currently exist. I'm not sure exactly what it would look like.

ANJU

over 2 years ago

Talulah said:
I would prefer to but we'd need an alternative that doesn't currently exist. I'm not sure exactly what it would look like.

I was thinking that if there was a way to take a post's model hash and get an identifiable name from it, I think that could be a good replacement. Maybe even make it another field on the AI Metadata section then there'd still be a way to attach model names without needing tags.

I can think of two issues from this idea though, assuming it's possible to implement;

How would it take into account multiple models used? Would it include every one of them used, or would it simply use the mix version name (if there is one)? Would it primarily count whichever one had majority use/influence (like your example of 75% NovelAI + 25% (50% Waifu Diffusion + 50% F22))? Would it just count whichever one was last used?

Searching. Currently, for the AI Metadata section there are has:metadata, has:prompt, and has:seed as searchable metatags. None of the other parameters are specifically searchable, but I imagine we would want to keep models searchable, so this would need to be considered: If multiple models can/would be listed, how would they be searched? Having something like has:model wouldn't be specific enough for someone wanting to look up only images generated with NovelAI.

antlers anon

over 2 years ago

ANJU said:
I was thinking that if there was a way to take a post's model hash and get an identifiable name from it, I think that could be a good replacement. Maybe even make it another field on the AI Metadata section then there'd still be a way to attach model names without needing tags.

Irritating as it is, that's impossible. I'm not sure why but in my testing mixing models with the "add difference" method preserved the hash of the first model. Using "weighted sum" created a completely new hash. If we want models to be searchable then we need a tag for each model and mix and users tagging, creating and describing them appropriately.

We could have yet another field on the upload screen with model autocompletion based on metadata hash that would display descriptions tied to them taken from wiki. Description would also need to change into an input if the hash does not exist in the database. Doing all this looks like a lot of work though so I don't expect it to happen.

Middle ground would be a simple input for model information. But this would only be useful for people trying to recreate the image and not searchable in any way.

ANJU

over 2 years ago

antlers_anon said:
Irritating as it is, that's impossible. I'm not sure why but in my testing mixing models with the "add difference" method preserved the hash of the first model. Using "weighted sum" created a completely new hash. If we want models to be searchable then we need a tag for each model and mix and users tagging, creating and describing them appropriately.

That's unfortunate.

I still believe it's a relatively small handful of users that actually care about tagging, creating and describing models properly. In fact, one of the problems I see is that a lot of uploads don't get even one model tag added (either due to laziness, or because the image is from somewhere where the metadata got removed), so having everything on AIBooru properly identified isn't possible anyway, I guess.

I did make a model tag listing sometime ago, and have been trying to pretty the page up by organizing it better on the wiki sandbox. I suppose we could have the mix models also sub-list what models they specifically use too, but I'm not sure yet.

antlers_anon said:
We could have yet another field on the upload screen with model autocompletion based on metadata hash that would display descriptions tied to them taken from wiki. Description would also need to change into an input if the hash does not exist in the database. Doing all this looks like a lot of work though so I don't expect it to happen.

I think it would be possible to implement, though if you say that sometimes the hash can stay the same, despite changing methods/weighting then I don't know how useful this would be.

antlers_anon said:
Middle ground would be a simple input for model information. But this would only be useful for people trying to recreate the image and not searchable in any way.

If someone did want to add that info, I feel they'll likely either make a comment or add it as a commentary on the post. That's what I've seen happen on occasion now.

Penance

over 2 years ago

ANJU said:
I was thinking that if there was a way to take a post's model hash and get an identifiable name from it, I think that could be a good replacement. Maybe even make it another field on the AI Metadata section then there'd still be a way to attach model names without needing tags.
I can think of two issues from this idea though, assuming it's possible to implement;
How would it take into account multiple models used? Would it include every one of them used, or would it simply use the mix version name (if there is one)? Would it primarily count whichever one had majority use/influence (like your example of 75% NovelAI + 25% (50% Waifu Diffusion + 50% F22))? Would it just count whichever one was last used?
Searching. Currently, for the AI Metadata section there are has:metadata, has:prompt, and has:seed as searchable metatags. None of the other parameters are specifically searchable, but I imagine we would want to keep models searchable, so this would need to be considered: If multiple models can/would be listed, how would they be searched? Having something like has:model wouldn't be specific enough for someone wanting to look up only images generated with NovelAI.

We already have a search page for everything else. The has:* metatags were just for convenience. Perhaps someday we'll get metatags for those as well.

The idea, I guess, would be to have a table of models and somehow do lookups such that something like model:novelai will search for all model hashes that use NovelAI (in part or in whole, perhaps with some way to exclude mixes or non-mixes). However, that doesn't seem particularly reliable either because of the bafflingly stupid way these model hashes are generated. Apparently it's generated by seeking to 0x10000 in the ckpt file, reading 0x10000 bytes, taking the SHA256 of that, and returning the first 8 characters of the hash. That makes it borderline worthless as a unique identifier, as those 65536 bytes end up being identical in many cases. There are a few open issues to fix hashes, but it doesn't seem anyone has settled on a solution. It also would not retroactively fix anything.
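For anyone who wants to see the problem, here is a rough Python sketch of a windowed hash like the one described. The offset and length are taken from the description above and are only defaults here. The exact values should be checked against the webui source. The function names and the fake checkpoint files are made up for the demonstration.

```python
import hashlib
import os
import tempfile


def short_model_hash(path, offset=0x10000, length=0x10000):
    """Hash only a fixed window of the file and keep the first 8 hex chars.

    Offset and length follow the description in the post (assumed values).
    """
    with open(path, "rb") as f:
        f.seek(offset)
        window = f.read(length)
    return hashlib.sha256(window).hexdigest()[:8]


def full_hash(path):
    """SHA256 of the entire file, for comparison."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def write_fake_checkpoint(tail):
    """Write a throwaway file: identical first 128 KiB, different tail."""
    with tempfile.NamedTemporaryFile(delete=False, suffix=".ckpt") as f:
        f.write(b"\x00" * 0x20000)  # covers the whole hashed window
        f.write(tail)               # never read by short_model_hash
        return f.name


def demo():
    """Return (short hashes match, full hashes match) for two fake files."""
    a = write_fake_checkpoint(b"weights of model A")
    b = write_fake_checkpoint(b"weights of model B")
    try:
        return (
            short_model_hash(a) == short_model_hash(b),
            full_hash(a) == full_hash(b),
        )
    finally:
        os.remove(a)
        os.remove(b)


print(demo())  # (True, False)
```

Two files that differ only outside the hashed window get the same short hash even though their full contents differ, which is the collision being complained about.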

I don't really know if there is any good solution for it at present. I'm open to suggestions if anyone has an idea.

swordfish

over 2 years ago

Not sure if this is helpful, but the webui has an option that can be enabled: "Add model name to generation information", which is off by default. ("Add model hash to generation information" is enabled by default.)

If you suggest enabling this on the upload page and store the model name + hash from all uploads, it should be possible to get most model hashes matched after some time, as just a few people using this should be enough to collect hashes.
I think this uses the filename of the ckpt, though I haven't tested it yet, so it might be a good idea to store all "seen" name + hash combinations and use the most used one as the name?

Maybe this would be a good base to think of a possible solution. Maybe it would even be possible to add links to the model downloads for the most popular ones.
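A minimal Python sketch of that idea, assuming the site could read a name + hash pair out of each upload's metadata. The `HashNameRegistry` class, the hash value, and the file names are all hypothetical.

```python
from collections import Counter, defaultdict


class HashNameRegistry:
    """Collect every name seen for a hash and report the most common one."""

    def __init__(self):
        # model hash -> Counter of names reported alongside it
        self._seen = defaultdict(Counter)

    def record(self, model_hash, model_name):
        """Store one name + hash pair taken from an upload's metadata."""
        self._seen[model_hash][model_name] += 1

    def best_name(self, model_hash):
        """Return the most frequently seen name, or None if the hash is new."""
        names = self._seen.get(model_hash)
        if not names:
            return None
        return names.most_common(1)[0][0]


registry = HashNameRegistry()
# Made-up hash and filenames; users often rename their ckpt files.
registry.record("abcd1234", "berry_mix")
registry.record("abcd1234", "berry_mix")
registry.record("abcd1234", "my_renamed_model")

print(registry.best_name("abcd1234"))  # berry_mix
print(registry.best_name("ffff0000"))  # None
```

Taking the most common name would smooth over renamed files, but it only helps if a hash really maps to a single model.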

user 1154

over 2 years ago

swordfish said:
Not sure if this is helpful, but the webui has an option that can be enabled: "Add model name to generation information", which is off by default. ("Add model hash to generation information" is enabled by default.)
If you suggest enabling this on the upload page and store the model name + hash from all uploads, it should be possible to get most model hashes matched after some time, as just a few people using this should be enough to collect hashes.
I think this uses the filename of the ckpt, though I haven't tested it yet, so it might be a good idea to store all "seen" name + hash combinations and use the most used one as the name?
Maybe this would be a good base to think of a possible solution. Maybe it would even be possible to add links to the model downloads for the most popular ones.

If I am understanding you correctly, you are stating that if we had the names that went with the hashes, we could simply tag the models based on the included hashes. The fundamental problem, however, is that the hash listed in the AI Metadata is not in fact a unique identifier. Merged models and models being trained can still retain the same hash as others due to how small and selective the data being hashed is. Any solution that involves making the included hashes actually unique (or even just more unique) would require upstream changes from the webUIs and GUIs that are including said hashes to begin with.

In short, we can't rely on the currently included metadata or the output image to identify the model(s) used. Self-tagging the model(s) used is the only way to reliably attach a name to the model used.

As for whether merged models should simply be called by their common name or list all the models in the mix, I don't know. Nutmegmix uses six models though, so that will be a long model tag list if anyone decides to use it here.

Edit: Actually after thinking about it, individual tags for merged models sound like a good idea if people have to self-tag their uploads anyway. Like antlers anon said above, a list of models with a merged_model tag doesn't really help much, because it says nothing about how those models were used.

For example, two images could use the same models in their mix, same seed, prompt, etc. and appear identical to AIBooru from the tags to the metadata, but be completely different depending on how those models were merged. berry_mix and blackberry_mix both use the same constituent models (novelai, stable_diffusion, zeipher_f222, r34_e4) for a real world example of this happening. Furthermore, berry_mix already gives more information in a single tag than the 5 other tags that are intended to be used instead, and at the same time differentiates posts that may use blackberry_mix which would be intended to use the same 5 tags as berry_mix. Even if berry_mix's wiki isn't filled out, it still shows viewers how the image was generated and what to look up for more. Finally, while it could lead to a lot of rarely used model tags that would only get a few posts, it still would likely be dwarfed by the amount of artist tags being created for every self-uploader on here. Which isn't a knock on self-uploaders, just showing how many rarely used tags are getting created all the time for good reason already anyway.

Updated over 2 years ago

antlers anon

over 2 years ago

Did we reach a conclusion here? I slowly started retagging my posts to antlersmix_2 etc. and removing merged_model from them. My opinion is we should remove the merged_model tag altogether since we have modeltags:>1, and if the models aren't stated, just knowing that it was a merge is pointless. In the case of merges that have their own tags, like my mixes, berrymix etc., this information should be included in the wiki and not depend on the uploader knowing if what they're using is a merge, so again, the tag serves no purpose.

Lyren

over 2 years ago

antlers_anon said:
Did we reach a conclusion here? I slowly started retagging my posts to antlersmix_2 etc. and removing merged_model from them. My opinion is we should remove the merged_model tag altogether since we have modeltags:>1, and if the models aren't stated, just knowing that it was a merge is pointless. In the case of merges that have their own tags, like my mixes, berrymix etc., this information should be included in the wiki and not depend on the uploader knowing if what they're using is a merge, so again, the tag serves no purpose.

About removing merged model: What about cases like x y grid? These kinds of images can have many combinations of modeltags: search and merged model search.

antlers anon

over 2 years ago

Lyren said:
About removing merged model: What about cases like x y grid? These kinds of images can have many combinations of modeltags: search and merged model search.

One can just exclude x_y_grid when searching, but I see your point. So maybe we should have implications from all of the model tags that are merges to merged_model? The only question is whether anyone cares enough to find this information and create implications / tag the posts. I don't really mind either way, I'd just like to have a clear rule for how I should tag my uploads.

Ocean3

over 2 years ago

antlers_anon said:
Did we reach a conclusion here? I slowly started retagging my posts to antlersmix_2 etc. and removing merged_model from them. My opinion is we should remove the merged_model tag altogether since we have modeltags:>1, and if the models aren't stated, just knowing that it was a merge is pointless. In the case of merges that have their own tags, like my mixes, berrymix etc., this information should be included in the wiki and not depend on the uploader knowing if what they're using is a merge, so again, the tag serves no purpose.

I tend to agree for the most part, and this goes along with what others mentioned in the Discord server. merged model posts tend to list each model tag used regardless, even in the case of x/y grids. They are also visually represented on the image if clearly stated.

One thing I thought to mention is cases where model (A) is used for generation while model (B) is used when upscaling. This is something more suited to be listed in the commentary, however. It's something to note as another example of merged model use nonetheless.
